(57) Abstract

The present invention provides a purified protein
designated P/CAF having a molecular weight of about
93,000 daltons as determined by sodium dodecyl sulfate
polyacrylamide gel electrophoresis under reducing
conditions and which acetylates histones and which also
binds to the p300/CBP cellular protein. The present
invention further provides a nucleic acid encoding the
P/CAF protein as well as a vector containing the nucleic
acid and a host for the vector. A purified antibody
which specifically binds the P/CAF protein is also
provided. Also provided are methods of screening for
compounds that inhibit or stimulate the transcription
modulating and histone acetyltransferase activity of P/CAF
and p300/CBP.

P300/CBP-ASSOCIATED TRANSCRIPTIONAL CO-FACTOR P/CAF AND USES THEREOF

BACKGROUND OF THE INVENTION

Field of the Invention 

The present invention provides a transcriptional co-factor, p300/CBP-associated 
factor (P/CAF), which modulates transcription through binding to the cellular 
transcription co-factors p300 and CBP and through acetylation of histones. Also 
provided are methods for screening for the presence of P/CAF and for substances which
alter the transcription modulating effect and growth regulatory activity of P/CAF. 

Background Art 

Cellular proteins p300 and CBP are global transcriptional coactivators that are
involved in the regulation of various DNA-binding transcriptional factors (Janknecht and
Hunter, 1996). Recently, p300 was found to be very closely related to CBP, a factor
that binds selectively to the protein kinase A-phosphorylated form of CREB (3-5).
Cellular factors p300 and CBP exhibit strong amino acid sequence similarity and share
the capacity to bind both CREB and E1A (6-8). Although neither p300 nor CBP by
itself binds to DNA, each can be recruited to promoter elements via interaction with
sequence-specific activators and can function as a transcriptional adaptor. For
simplicity, p300 and CBP will be termed p300/CBP in the context of discussing their
shared functional properties.

p300/CBP is a large protein consisting of over 2,400 amino acids, known to
interact with a variety of DNA-binding transcriptional factors including nuclear hormone
receptors (13,57), CREB (3,4,7), c-Jun/v-Jun (9,11), YY1 (10), c-Myb/v-Myb (12,58),
Sap-1a (59), c-Fos (11) and MyoD (60). DNA-binding factors recruit p300/CBP not
only by direct interactions but also by indirect interactions through cofactors. For
example, nuclear hormone receptors recruit p300/CBP directly as well as indirectly via
SRC-1, which stimulates transcription by binding to various nuclear hormone receptors
(13,61).

The transforming proteins encoded by adenovirus and several other small DNA
tumor viruses disturb host cell growth control by interacting with cellular factors that
normally function to repress cell proliferation. One of the most intensively studied of
these viral proteins, the product of the adenovirus E1A gene, is itself sufficient for
transformation (1). E1A transforming activity resides in two distinct domains, the
targets of which include p300/CBP and products of the retinoblastoma (RB)
susceptibility gene family (1,2). Interactions of E1A with p300/CBP and RB are
thought to influence functionally distinct growth regulatory pathways, allowing the two
domains to contribute additively to transformation (1).



The paradigm for how E1A and functionally related viral proteins perturb cell
growth regulation derives in large part from studies on their interactions with RB (1,2).
The molecular function of E1A is based on its capacity to interfere with cellular
protein-protein interactions. Since both E1A and various cellular targets bind to a site in RB
termed the pocket domain (2), E1A can competitively disrupt the complex formation
between RB and its cellular targets.

The second cellular factor implicated in E1A-dependent transformation, p300, is
believed to inhibit G0/G1 exit, to activate certain enhancers, and to stimulate
differentiation (1,2). E1A inhibits the p300/CBP-mediated transcriptional activation of
many promoters (14). In one case that has been examined, the complex of p300 and
YY1, E1A inhibits transcription without disrupting the complex (10).

The present invention provides a cellular protein designated P/CAF which binds
to p300/CBP and plays an important role in both transcription and cell cycle regulation
associated with a histone acetyltransferase activity. The present invention also provides
a histone acetyltransferase activity in the p300/CBP cellular protein, thus providing
targets for modulating transcription and cell cycle regulation in cells.

SUMMARY OF THE INVENTION



The present invention provides a purified protein designated P/CAF having a 
molecular weight of about 93,000 daltons as determined by sodium dodecyl sulfate 
polyacrylamide gel electrophoresis under reducing conditions and which acetylates
histones and which also binds to the p300/CBP cellular protein. 

The present invention further provides a nucleic acid encoding the P/CAF
protein as well as a vector containing the nucleic acid and a host for the vector. A
purified antibody which specifically binds the P/CAF protein is also provided.

Also provided is a bioassay for screening substances for the ability to
inhibit the transcription modulating activity of P/CAF and/or histone acetyltransferase
activity, comprising contacting the substance with a system in which histone acetylation
by P/CAF can be determined; determining the amount of histone acetylation by P/CAF
in the presence of the substance; and comparing the amount of histone acetylation by
P/CAF in the presence of the substance with the amount of histone acetylation by
P/CAF in the absence of the substance, a decreased amount of histone acetylation by
P/CAF in the presence of the substance indicating a substance that can inhibit the
transcription modulating activity and/or histone acetyltransferase activity of P/CAF.

Furthermore, the present invention provides a bioassay for screening substances
for the ability to inhibit the transcription modulating activity and/or histone
acetyltransferase activity of P/CAF comprising contacting the substance with a system in
which the p300 binding of P/CAF can be determined; determining the amount of p300
binding of P/CAF in the presence of the substance; and comparing the amount of p300
binding of P/CAF in the presence of the substance with the amount of p300 binding of
P/CAF in the absence of the substance, a decreased amount of p300 binding of P/CAF in
the presence of the substance indicating a substance that can inhibit the transcription
modulating activity and/or histone acetyltransferase activity of P/CAF.

Also provided is a method for determining the amount of P/CAF in a biological
sample comprising contacting the biological sample with a polypeptide comprising the
amino acid sequence of SEQ ID NO:3 under conditions whereby a P/CAF/p300
complex can be formed; and determining the amount of the P/CAF/p300 complex, the
amount of the complex indicating the amount of P/CAF in the sample.

The present invention additionally provides a method for determining the amount 
of P/CAF in a biological sample comprising contacting the biological sample with an 
antibody which specifically binds P/CAF under conditions whereby a P/CAF/antibody 
complex can be formed; and determining the amount of the P/CAF/antibody complex,
the amount of the complex indicating the amount of P/CAF in the sample. 



Also provided herein is an assay for screening substances for the ability to inhibit
or stimulate the histone acetyltransferase activity of P/CAF, comprising: contacting the
substance with a system in which histone acetylation by P/CAF can be determined;
determining the amount of histone acetylation by P/CAF in the presence of the
substance; and comparing the amount of histone acetylation by P/CAF in the presence of
the substance with the amount of histone acetylation by P/CAF in the absence of the
substance, a decreased or increased amount of histone acetylation by P/CAF in the
presence of the substance indicating a substance that can inhibit or stimulate,
respectively, the histone acetyltransferase activity of P/CAF.
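
By way of illustration only, the comparison recited in the screening assays above reduces to a ratio of the acetylation measured in the presence of the substance to that measured in its absence. The following Python sketch shows one way such a comparison might be scored. The function name, the 20% tolerance and the example values are hypothetical assumptions introduced for this sketch and are not taken from the assays described herein.

```python
def classify_substance(with_substance, without_substance, tolerance=0.20):
    """Classify a test substance from two histone acetylation measurements.

    Both arguments are acetylation signals (for example, incorporated label)
    in the same arbitrary units. `tolerance` is a hypothetical cut-off for
    calling a change meaningful; it is an assumption of this sketch.
    """
    if without_substance <= 0:
        raise ValueError("control acetylation must be positive")
    ratio = with_substance / without_substance
    if ratio < 1.0 - tolerance:
        return "inhibitor"    # decreased acetylation in presence of substance
    if ratio > 1.0 + tolerance:
        return "stimulator"   # increased acetylation in presence of substance
    return "no effect"


# Illustrative numbers only.
print(classify_substance(350.0, 1000.0))
print(classify_substance(1600.0, 1000.0))
print(classify_substance(1050.0, 1000.0))
```

The same scoring could be applied to the binding assays by substituting the amount of complex formed for the acetylation signal.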

The present invention further provides an assay for screening substances for the
ability to inhibit binding of P/CAF to p300/CBP comprising: contacting the substance
with a system in which the P/CAF binding of p300/CBP can be determined; determining
the amount of P/CAF binding of p300/CBP in the presence of the substance; and
comparing the amount of binding of P/CAF to p300/CBP in the presence of the
substance with the amount of binding of P/CAF to p300/CBP in the absence of the
substance, a decreased amount of binding of P/CAF to p300/CBP in the presence of the
substance indicating a substance that can inhibit binding of P/CAF to
p300/CBP.

In addition, an assay is provided for screening substances for the ability to inhibit
or stimulate the histone acetyltransferase activity of p300/CBP, comprising: contacting
the substance with a system in which histone acetylation by p300/CBP can be
determined; determining the amount of histone acetylation by p300/CBP in the presence
of the substance; and comparing the amount of histone acetylation by p300/CBP in the
presence of the substance with the amount of histone acetylation by p300/CBP in the
absence of the substance, a decreased or increased amount of histone acetylation by
p300/CBP in the presence of the substance indicating a substance that can inhibit or
stimulate, respectively, the histone acetyltransferase activity of p300/CBP.

Furthermore, the present invention provides an assay for screening substances
for the ability to inhibit binding of a DNA-binding transcription factor to p300/CBP
comprising: contacting the substance with a system in which the DNA-binding
transcription factor binding of p300/CBP can be determined; determining the amount of
DNA-binding transcription factor binding of p300/CBP in the presence of the substance;
and comparing the amount of binding of DNA-binding transcription factor to p300/CBP
in the presence of the substance with the amount of binding of DNA-binding
transcription factor to p300/CBP in the absence of the substance, a decreased amount of
binding of DNA-binding transcription factor to p300/CBP in the presence of the
substance indicating a substance that can inhibit binding of DNA-binding
transcription factor to p300/CBP.

A method is also provided for inhibiting the transcription modulating activity of 
P/CAF in a subject, comprising administering to the subject a transcription modulating
activity inhibiting amount of a substance in a pharmaceutically acceptable carrier. 

Also provided in the present invention is a method for stimulating the 
transcription modulating activity of P/CAF in a subject, comprising administering to the 
subject a transcription modulating activity stimulating amount of a substance in a
pharmaceutically acceptable carrier.

Furthermore, the present invention provides a method for inhibiting the histone
acetyltransferase activity of p300/CBP in a subject, comprising administering to the
subject a histone acetyltransferase activity inhibiting amount of a substance in a
pharmaceutically acceptable carrier.

Finally, the present invention additionally provides a method for stimulating the
histone acetyltransferase activity of p300/CBP in a subject, comprising administering to
the subject a histone acetyltransferase activity stimulating amount of a substance in a
pharmaceutically acceptable carrier.

BRIEF DESCRIPTION OF THE FIGURES 

Figs. 1A-B. Fig. 1A: P/CAF-p300/CBP interaction in vivo. Cell extract was
immunoprecipitated with rabbit anti-P/CAF (lanes 1, 4, and 7), rabbit anti-CBP (lanes 2
and 5), and mouse anti-p300 (lane 9) antibodies. For controls, cell extract was
precipitated with rabbit control IgG (lanes 3, 6, and 8) or mouse anti-HA monoclonal
antibody (lane 10). The precipitates were analyzed by immunoblotting with anti-P/CAF
(lanes 1-3), anti-CBP (lanes 4-6), and anti-p300 (lanes 7-10) antibodies. The positions
of non-specific bands are indicated by asterisks. Fig. 1B: E1A inhibits the P/CAF-p300
interaction in vivo. Osteosarcoma cells were transfected with either control vector
(lanes 1 and 4) or E1A- (lanes 2 and 5) or E1AΔN- (lanes 3 and 6) expression vectors.
Extract from the transfected subpopulation was immunoprecipitated with anti-P/CAF
(lanes 1-3) or control (lanes 4-6) IgG. The precipitates were analyzed by
immunoblotting with anti-p300 and anti-P/CAF.

Figs. 2A-F. P/CAF and E1A mediate antagonistic effects on cell cycle
progression. HeLa cells (ATCC accession number CCL 2) were transfected by
electroporation with 7 μg of P/CAF-expression plasmid and/or 3 μg of the full-length or
the N-terminally deleted (Δ2-36) E1A 12S-expression plasmid as indicated in the figure.
These plasmids were constructed by subcloning FLAG-P/CAF and E1A cDNAs into
pCX (34) and pcDNAI (Invitrogen), respectively. All samples, in addition, contained 1
μg of sorting plasmid (pCMV-IL2R) (31) and carrier plasmid (pCX) to normalize the
total amount of DNA to 11 μg. After transfection, cells were incubated in Dulbecco's
modified Eagle's medium with 10% fetal bovine calf serum for 12 hours and
subsequently labeled in medium containing 10 μM bromo-deoxyuridine (BrdU) for 30
min. Subsequently, the transfected subpopulation was purified by magnetic affinity cell
sorting and nuclei were analyzed by dual parameter flow cytometry as described (32).
Histograms show percentages of cells in G1 and S phases. Abscissa values represent
fluorescence intensity of bound anti-BrdU antibodies in log scale.

Fig. 3. Histone acetyltransferase activity of P/CAF. Activity of hGCN5 (lanes 1
and 4) and P/CAF (lanes 2 and 5) that acetylates free histones (lanes 1-3) or histones in
the nucleosome core particle (35) (lanes 4-6) was measured as described (36). Each
reaction contains 0.3 pmol of affinity purified FLAG-hGCN5 or FLAG-P/CAF, 4 pmol
of the histone octamer or the nucleosome core particle and 10 pmol of [1-14C]acetyl-CoA.
Note that the histone octamer dissociates into dimers or tetramers under assay
conditions. Acetylated histones were detected by autoradiography after separation by
SDS-PAGE. The bands corresponding to acetylated histones H3 and H4 are indicated
by arrows.

DETAILED DESCRIPTION OF THE INVENTION 

As used in the specification and in the claims, "a" can mean one or more, 
depending upon the context in which it is used. 


P/CAF protein and fragments 

The present invention provides a purified protein designated P/CAF having a
molecular weight of about 93,000 daltons as determined by sodium dodecyl sulfate
polyacrylamide gel electrophoresis under reducing conditions and which acetylates
histones. The P/CAF protein can also bind to the amino acid region of SEQ ID NO:3
(amino acid (aa) residues 1753 - 1966) of the cellular transcriptional factor, p300 (which
has the complete amino acid sequence of SEQ ID NO:6 and the nucleotide sequence of
SEQ ID NO:12), and the amino acid region of SEQ ID NO:6 (amino acid residues 1805
- 1854) of the cellular transcriptional factor, CBP (which has the complete amino acid
sequence of SEQ ID NO:7 and the nucleotide sequence of SEQ ID NO:13). The
P/CAF protein can be defined by any one or more of the typically used parameters.
Examples of these parameters include, but are not limited to, molecular weight
(calculated or empirically determined), isoelectric focusing point, specific epitope(s),
complete amino acid sequence, sequence of a specific region (e.g., N-terminus) of the
amino acid sequence and the like.

For example, the P/CAF protein can consist of the amino acid sequence of SEQ
ID NO:1 or the P/CAF protein can comprise the amino acid sequence of SEQ ID NO:2,
which represents the carboxy terminal end of the P/CAF protein and contains the histone
acetyltransferase activity, or the amino acid sequence of SEQ ID NO:4, which
represents the amino terminal end of the P/CAF protein, containing the binding site for
p300/CBP. Because the amino-terminal region is specific for P/CAF, it can be used to
define and identify P/CAF.

As used herein, "purified" refers to a protein (polypeptide, peptide, etc.) that is
sufficiently free of contaminants or cell components with which it normally occurs to
distinguish it from the contaminants or other components of its natural environment.
The purified protein need not be homogeneous, but must be sufficiently free of
contaminants to be useful in a clinical or research setting, for example, in an assay for
detecting antibodies to the protein. Greater levels of purity can be obtained using
methods derived from well known protocols. Specific methods for purifying P/CAF
proteins are known in the art.

As will be appreciated by those skilled in the art, the invention also includes
those P/CAF polypeptides having slight variations in amino acid sequence which yield
polypeptides equivalent to the P/CAF protein defined herein. Such variations may arise
naturally as allelic variations (e.g., due to genetic polymorphism) or may be produced by
human intervention (e.g., by mutagenesis of cloned DNA sequences), such as induced
point, deletion, insertion and substitution mutants. Minor changes in amino acid
sequence are generally preferred, such as conservative amino acid replacements, small
internal deletions or insertions, and additions or deletions at the ends of the molecules.
Substitutions may be designed based on, for example, the model of Dayhoff, et al. (37).
These modifications can result in changes in the amino acid sequence, provide silent
mutations, modify a restriction site, or provide other specific mutations.

Modifications to any of the P/CAF proteins or fragments can be made, while
preserving the specificity and activity (function) of the native protein or fragment
thereof. As used herein, "native" describes a protein that occurs in nature. The
modifications contemplated herein can be conservative amino acid substitutions, for
example, the substitution of a basic amino acid for a different basic amino acid.
Modifications can also include creation of fusion proteins with epitope tags or known
recombinant proteins or genes encoding them created by subcloning into commercial or
non-commercial vectors (e.g., polyhistidine tags, flag tags, myc tag, glutathione-S-transferase
[GST] fusion protein, xylE fusion reporter construct). Furthermore, the
modifications can be such as do not affect the function of the protein or the way the
protein accomplishes that function (e.g., its secondary structure or the ultimate result of
the protein's activity). These products are equivalent to the P/CAF protein. The means
for determining the function, way and result parameters are well known.

Having provided an example of a purified P/CAF protein, the invention also
enables the purification of P/CAF homologs from other species and allelic variants from
individuals within a species. For example, an antibody raised against the exemplary
human P/CAF protein can be used routinely to screen preparations from different
humans for allelic variants of the P/CAF protein that react with the P/CAF
protein-specific antibody. Similarly, an antibody raised against an epitope, for example, from a
conserved amino acid region of the human P/CAF protein can be used to routinely
screen for homologs of the P/CAF protein in other species. A P/CAF protein can be
routinely identified in and obtained from other species and from individuals within a
species using the methods taught herein and others known in the art. For example,
given the present sequence, the DNA encoding a conserved amino acid sequence can be
used to probe genomic DNA or DNA libraries of an organism to predictably obtain the
P/CAF gene for that organism. The gene can then be cloned and expressed as the
P/CAF protein and purified according to any of a number of routine, predictable
methods. An example of the routine protein purification methods available in the art can
be found in Pei et al. (38).

A purified polypeptide fragment of the P/CAF protein is also provided. The
term "fragment" as used herein regarding a P/CAF protein, means a molecule of at least
five contiguous amino acids of P/CAF protein that has at least one function shared by
P/CAF protein or a region thereof. These functions can include antigenicity, binding
capacity, acetyltransferase activity and structural roles, among others. The P/CAF
fragment can be specific for a recited source. As used herein to describe an amino acid
sequence (protein, polypeptide, peptide, etc.), "specific" means that the amino acid
sequence is not found identically in any other source. The determination of specificity is
made routine by the availability of computerized amino acid sequence databases and
sequence comparison programs, wherein an amino acid sequence of almost any length
can be quickly and reliably checked for the existence of identical sequences. If an
identical sequence is not found, the protein is "specific" for the recited source. For
example, a P/CAF fragment can be species-specific (e.g., found in the P/CAF protein of
humans, but not of other species).
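
By way of illustration only, the identity check described above can be sketched in a few lines of Python. The function name and the short sequences below are hypothetical placeholders invented for this sketch; they are not P/CAF sequences and do not represent any particular database or comparison program.

```python
def is_specific(fragment, other_sources):
    """Return True if `fragment` is not found identically in any other source.

    `other_sources` maps a source label to an amino acid sequence written in
    one-letter code. The five-residue minimum mirrors the definition of
    "fragment" given above.
    """
    if len(fragment) < 5:
        raise ValueError("a fragment has at least five contiguous amino acids")
    return not any(fragment in seq for seq in other_sources.values())


# Hypothetical sequences used purely as placeholders.
other_sources = {
    "species_B": "MKTAYIAKQRQISFVK",
    "species_C": "MSEAGGAGPGGCGAGA",
}

print(is_specific("AYIAK", other_sources))  # found in species_B, so not specific
print(is_specific("WWHHP", other_sources))  # found in neither, so specific
```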

A fragment of the P/CAF protein having histone acetyltransferase activity can
consist of the amino acid sequence of SEQ ID NO:2. A fragment of the P/CAF protein
which binds to the amino acid sequence of SEQ ID NO:3 on p300 and the amino acid
sequence of SEQ ID NO:9 on CBP can consist of the amino acid sequence of SEQ ID
NO:4. To the extent that these fragments are specific for P/CAF, they can be used to
identify and define P/CAF.

An antigenic fragment of P/CAF protein is provided. An antigenic fragment has
an amino acid sequence of at least about five consecutive amino acids of a P/CAF
protein amino acid sequence and binds an antibody or elicits an immune response in an
animal. An antigenic fragment can be selected by applying the routine technique of
epitope mapping to P/CAF protein to determine the regions of the proteins that contain
epitopes reactive with antibodies or are capable of eliciting an immune response in an
animal. Once the epitope is selected, an antigenic polypeptide containing the epitope
can be synthesized directly, or produced recombinantly by cloning nucleic acids
encoding the antigenic polypeptide in an expression system, according to standard
methods.
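
By way of illustration only, one common way to prepare candidate peptides for epitope mapping is to enumerate overlapping windows along the protein sequence. The Python sketch below assumes such a peptide-scanning approach, which is not prescribed herein; the function name, window length, step size and example sequence are hypothetical.

```python
def overlapping_peptides(sequence, length=5, step=1):
    """List overlapping peptides of `length` residues taken every `step` residues.

    The default length of five reflects the "at least about five consecutive
    amino acids" stated above; longer windows are equally possible.
    """
    if length < 1 or step < 1:
        raise ValueError("length and step must be positive")
    last_start = len(sequence) - length
    return [sequence[i:i + length] for i in range(0, last_start + 1, step)]


# Placeholder sequence, not a P/CAF sequence.
print(overlapping_peptides("MSEAGGAG", length=5, step=1))
print(overlapping_peptides("MSEAGGAG", length=5, step=3))
```

Each peptide in the resulting list would then be synthesized and tested for antibody binding by routine methods.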

Alternatively, an antigenic fragment of the antigen can be isolated from the
whole P/CAF protein or a larger fragment of the P/CAF protein by chemical or
mechanical disruption. Fragments can also be randomly chosen from a known P/CAF
protein sequence and synthesized. The purified fragments thus obtained can be tested to
determine their antigenicity and specificity by routine methods.

Nucleic Acids Encoding P/CAF Protein 

An isolated nucleic acid that encodes a P/CAF protein is also provided. As used
herein, the term "isolated" means a nucleic acid separated or substantially free from at
least some of the other components of the naturally occurring organism, for example,
the cell structural components commonly found associated with nucleic acids in a
cellular environment and/or other nucleic acids. The isolation of nucleic acids can
therefore be accomplished by techniques such as cell lysis followed by phenol plus
chloroform extraction, followed by ethanol precipitation of the nucleic acids (39). It is
not contemplated that the isolated nucleic acids are necessarily totally free of all
non-nucleic acid components or all other nucleic acids, but that the isolated nucleic acids are
isolated to a degree of purification to be useful in clinical, diagnostic, experimental, or
other procedures such as, for example, gel electrophoresis, Southern, Northern or dot
blot hybridization, or polymerase chain reaction (PCR).
A skilled artisan in the field will readily appreciate that there are a multitude of
procedures which may be used to isolate the nucleic acids prior to their use in other
procedures. These include, but are not limited to, lysis of the cell followed by gel
filtration or anion exchange chromatography, binding DNA to silica in the form of glass
beads, filters or diatoms in the presence of high concentrations of chaotropic salts, or
ethanol precipitation of the nucleic acids.

The nucleic acids of the present invention can include positive and negative
strand RNA as well as DNA and can include genomic and subgenomic nucleic acids
found in the naturally occurring organism. The nucleic acids contemplated by the
present invention include double stranded and single stranded DNA of the genome,
complementary positive stranded cRNA and mRNA, and complementary cDNA
produced therefrom and any nucleic acid which can selectively or specifically hybridize
to the isolated nucleic acids provided herein. Stringent conditions (further described
below) are used to distinguish selectively or specifically hybridizing nucleic acids from
non-selectively and non-specifically hybridizing nucleic acids.

An isolated nucleic acid that encodes a P/CAF protein can be species-specific
(i.e., does not encode the P/CAF protein of other species and does not occur in other
species). Examples of the nucleic acids contemplated herein include the nucleic acid of
SEQ ID NO:10 as well as the nucleic acids that encode each of the P/CAF proteins or
fragments thereof described herein. P/CAF proteins and protein fragments can be
routinely obtained as described herein and their structure (sequence) determined by
routine means including the methods as used herein.

P/CAF protein-encoding nucleic acids can be isolated from an organism in which
they are normally found (e.g., humans), using any of the routine techniques. For
example, a genomic DNA or cDNA library can be constructed and screened for the
presence of the nucleic acid of interest using one of the present P/CAF protein-encoding
nucleic acids as a probe. Methods of constructing and screening such libraries are well
known in the art and kits for performing the construction and screening steps are
commercially available (for example, Stratagene Cloning Systems, La Jolla, CA). Once
isolated, the nucleic acid can be directly cloned into an appropriate vector, or if
necessary, be modified to facilitate the subsequent cloning steps. Such modification
steps are routine, an example of which is the addition of oligonucleotide linkers, which
contain restriction sites, to the termini of the nucleic acid (see, for example, ref. 39).

P/CAF protein-encoding nucleic acids can also be synthesized. For example, a
method of obtaining a DNA molecule encoding a specific P/CAF protein is to synthesize
a recombinant DNA molecule which encodes the P/CAF protein. For example, nucleic
acid synthesis procedures are routine in the art and oligonucleotides coding for a
particular protein region are readily obtainable through automated DNA synthesis. A
nucleic acid for one strand of a double-stranded molecule can be synthesized and
hybridized to its complementary strand. One can design these oligonucleotides such that
the resulting double-stranded molecule has either internal restriction sites or appropriate
5' or 3' overhangs at the termini for cloning into an appropriate vector.
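
By way of illustration only, the sequence of the complementary strand to be synthesized can be derived from the first strand by standard base pairing, as in the following Python sketch. The function name and the short example oligonucleotide are hypothetical and do not correspond to any P/CAF sequence. The EcoRI recognition site GAATTC is used only as a familiar example of a restriction site that reads identically on both strands.

```python
# Watson-Crick pairing for DNA: A pairs with T, and C pairs with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand):
    """Return the complementary strand, written 5' to 3'."""
    strand = strand.upper()
    unexpected = set(strand) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected bases: {sorted(unexpected)}")
    return strand.translate(COMPLEMENT)[::-1]


# Placeholder oligonucleotide beginning with an EcoRI site (GAATTC).
top = "GAATTCATG"
print(reverse_complement(top))       # bottom strand, 5' to 3'
print(reverse_complement("GAATTC"))  # palindromic site reads the same
```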

Oligonucleotides complementary to or identical with the P/CAF protein-encoding
nucleic acid sequence can be synthesized as primers for amplification
reactions, such as PCR, or as probes to detect P/CAF protein-encoding nucleic acids by
various hybridization protocols (e.g., Northern blot, Southern blot, dot blot, colony
screening, etc.).

Double-stranded molecules coding for relatively large proteins can readily be
synthesized by first constructing several different double-stranded molecules that code
for particular regions of the protein, followed by ligating these DNA molecules together.
For example, Cunningham, et al. (40), have constructed a synthetic gene encoding the
human growth hormone by first constructing overlapping and complementary synthetic
oligonucleotides and ligating these fragments together. See also, Ferretti, et al. (41),
wherein synthesis of a 1057 base pair synthetic bovine rhodopsin gene from synthetic
oligonucleotides is disclosed.
By constructing a P/CAF protein-encoding nucleic acid in this manner, one
skilled in the art can readily obtain any particular P/CAF protein with modifications at
any particular position or positions. See also, U.S. Patent No. 5,503,995 which
describes an enzyme template reaction method of making synthetic genes. Techniques
such as this are routine in the art and are well documented. DNA encoding the P/CAF
protein or P/CAF protein fragments can then be expressed in vivo or in vitro.

The nucleic acid encoding the P/CAF protein can be any nucleic acid that
functionally encodes the P/CAF protein. To functionally encode the protein (i.e., allow
the nucleic acid to be expressed), the nucleic acid can include, but is not limited to,
expression control sequences, such as an origin of replication, a promoter, regions
upstream or downstream of the promoter, such as enhancers that may regulate the
transcriptional activity of the promoter, appropriate restriction sites to facilitate cloning
of inserts adjacent to the promoter, antibiotic resistance genes or other markers which
can serve to select for cells containing the vector or the vector containing the insert, and
necessary information processing sites, such as ribosome binding sites, RNA splice sites,
polyadenylation sites and transcription termination sequences as well as any other
sequence which may facilitate the expression of the inserted nucleic acid.

Preferred expression control sequences are promoters derived from
metallothionine genes, actin genes, immunoglobulin genes, CMV, SV40, adenovirus,
bovine papilloma virus, etc. A nucleic acid encoding a P/CAF protein can readily be
determined based upon the genetic code for the amino acid sequence of the P/CAF
protein and many nucleic acid sequences will encode a P/CAF protein. Modifications in
the nucleic acid sequence encoding the P/CAF protein are also contemplated.
Modifications that can be useful are modifications to the sequences controlling
expression of the P/CAF protein to make production of P/CAF protein inducible or
repressible as controlled by the appropriate inducer or repressor. Such means are
standard in the art (see, e.g., ref. 39). The nucleic acids can be generated by means
standard in the art, such as by recombinant nucleic acid techniques, as exemplified in the
examples herein, and by synthetic nucleic acid synthesis or in vitro enzymatic synthesis.
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After a nucleic acid encoding a particular P/CAF protein of interest, or a region 
of that nucleic acid, is constructed, modified, or isolated, that nucleic acid can then be 
cloned into an appropriate vector, which can direct the in vivo or in vitro synthesis of 
that wild-type and/or modified P/CAF protein. The vector is contemplated to have the 
5 necessary fiinctional elements that direct and regulate transcription of the inserted 
nucleic acid, as described above. The vector containing the P/CAF nucleic acid or 
nucleic acid fragment can be in a host (e.g., cell or transgenic animal) for expressing the 
nucleic acid. The P/CAF protein or fragment thereof can thus be produced in a host 
system containing the expression vector and its functional activity as described herein 
10 can be demonstrated according to methods well known in the art. 

There are numerous E. coli (Escherichia coli) expression vectors known to one
of ordinary skill in the art useful for the expression of proteins. Other microbial hosts
suitable for use include bacilli, such as Bacillus subtilis, and other enterobacteria, such
as Salmonella and Serratia, as well as various Pseudomonas species. These prokaryotic
hosts can support expression vectors which will typically contain expression control
sequences compatible with the host cell (e.g., an origin of replication). In addition, any
number of a variety of well-known promoters will be present, such as the lactose
promoter system, a tryptophan (Trp) promoter system, a beta-lactamase promoter
system, or a promoter system from phage lambda. The promoters will typically control
expression, optionally with an operator sequence, and have ribosome binding site
sequences, for example, for initiating and completing transcription and translation. If
necessary, an amino terminal methionine can be provided by insertion of a Met codon 5'
and in-frame with the gene sequence. Also, the carboxy-terminal extension of the
protein can be removed using standard oligonucleotide mutagenesis procedures.
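The insertion of an in-frame Met codon described above can be illustrated with a minimal sketch. The function name `ensure_start_codon` and the example sequence are hypothetical and are not taken from this disclosure; the sketch assumes the input is a DNA coding sequence whose first base is the first base of a codon.

```python
def ensure_start_codon(cds: str) -> str:
    """Return the coding sequence with an in-frame ATG (Met) codon at its 5' end.

    Hypothetical helper: assumes `cds` is a DNA coding sequence that begins at
    the first base of a codon.
    """
    seq = cds.upper().replace("U", "T")  # accept RNA-style input as well
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    if seq.startswith("ATG"):
        return seq  # already begins with a Met codon
    # Prepending exactly one codon (three bases) leaves the downstream
    # reading frame unchanged, so the insertion is in-frame by construction.
    return "ATG" + seq


# Illustrative sequence only (Ala-Glu-Lys-stop); not a P/CAF sequence.
print(ensure_start_codon("GCTGAAAAATAA"))
```

Because a single three-base codon is added, every downstream codon keeps its original reading frame.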

Additionally, yeast expression can be used. There are several advantages to
yeast expression systems. First, evidence exists that proteins produced in yeast secretion
systems exhibit correct disulfide pairing. Second, post-translational glycosylation is
efficiently carried out by yeast secretory systems. The Saccharomyces cerevisiae
pre-pro-alpha-factor leader region (encoded by the MFα-1 gene) is routinely used to direct
protein secretion from yeast (42). The leader region of pre-pro-alpha-factor contains a
signal peptide and a pro-segment which includes a recognition sequence for a yeast
protease encoded by the KEX2 gene. This enzyme cleaves the precursor protein on the
carboxyl side of a Lys-Arg dipeptide cleavage-signal sequence. The polypeptide coding
sequence can be fused in-frame to the pre-pro-alpha-factor leader region. This construct
is then put under the control of a strong transcription promoter, such as the alcohol
dehydrogenase I promoter or a glycolytic promoter. The protein coding sequence is
followed by a translation termination codon which is followed by transcription
termination signals. Alternatively, the polypeptide encoding sequence of interest can be
fused to a second protein coding sequence, such as Sj26 or β-galactosidase, used to
facilitate purification of the resultant fusion protein by affinity chromatography. The
insertion of protease cleavage sites to separate the components of the fusion protein is
applicable to constructs used for expression in yeast.

Efficient post-translational glycosylation and expression of recombinant proteins
can also be achieved in Baculovirus expression systems in insect cells.

Mammalian cells permit the expression of proteins in an environment that favors
important post-translational modifications such as folding and cysteine pairing, addition
of complex carbohydrate structures and secretion of active protein. Vectors useful for
the expression of proteins in mammalian cells are characterized by insertion of the
protein encoding sequence between a strong viral promoter and a polyadenylation
signal. The vectors can contain genes conferring either gentamicin or methotrexate
resistance for use as selectable markers. For example, the antigen and immunoreactive
fragment coding sequence can be introduced into a Chinese hamster ovary (CHO) cell
line using a methotrexate resistance-encoding vector. Presence of the vector RNA in
transformed cells can be confirmed by Northern blot analysis and production of a cDNA
or opposite strand RNA corresponding to the protein encoding sequence can be
confirmed by Southern and Northern blot analysis, respectively. A number of other
suitable host cell lines capable of secreting intact proteins have been developed in the art
and include the CHO cell lines, HeLa cells, myeloma cell lines, Jurkat cells, and the like.
Expression vectors for these cells can include expression control sequences, as described
above. The vectors containing the nucleic acid sequences of interest can be transferred
into the host cell by well-known methods, which vary depending on the type of cell host.
For example, calcium chloride transfection is commonly utilized for prokaryotic cells,
whereas calcium phosphate treatment or electroporation may be used for other cell
hosts.

Alternative vectors for the expression of protein in mammalian cells, similar to
those developed for the expression of human gamma-interferon, tissue plasminogen
activator, clotting Factor VIII, hepatitis B virus surface antigen, protease Nexin 1, and
eosinophil major basic protein, can be employed. Further, the vector can include CMV
promoter sequences and a polyadenylation signal available for expression of inserted
nucleic acid in mammalian cells (such as COS7).

The nucleic acid sequences can be expressed in hosts after the sequences have
been positioned to ensure the functioning of an expression control sequence. These
expression vectors are typically replicable in the host organisms either as episomes or as
an integral part of the host chromosomal DNA. Commonly, expression vectors can
contain selection markers, e.g., tetracycline resistance or hygromycin resistance, to
permit detection and/or selection of those cells transformed with the desired nucleic acid
sequences (see, e.g., U.S. Patent 4,704,362).

The nucleic acids produced as described above can also be expressed in a host
which is a non-human animal to create a transgenic animal, containing, in a germ or
somatic cell, a nucleic acid comprising the coding sequence for all or a portion of the
P/CAF protein, as well as all of the other regulatory elements required for expression of
the P/CAF protein-encoding sequence. The animal will express the P/CAF gene or
portion thereof to produce the P/CAF protein or protein fragment and such expression
can be detected by determination of a particular phenotype unique to the transgenic
animal expressing the transferred nucleic acid.

The nucleic acid can be the nucleic acid of SEQ ID NO:10, a nucleic acid having
a nucleotide sequence which encodes the P/CAF protein, a nucleic acid having a
nucleotide sequence which encodes the protein of SEQ ID NO:1, as well as the nucleic
acids that encode the proteins comprising the fragments of SEQ ID NOS:2 and 4.

The nucleic acids of the invention can contain substitutions or deletions which
provide a particular phenotype of interest. For example, various deletions or base
substitutions can be introduced into the nucleic acid encoding the P/CAF protein for the
purpose of studying the effects of these particular deletions or substitutions on the
transcription modulation activity of the P/CAF protein. These effects can be monitored
by observation of such characteristics as growth and development of the animal, the
ability to develop tumors, survival rates and the like. The gene construct introduced
into the animal cells to produce the transgenic animal can contain any of the regulatory
elements described above to modulate expression of the foreign genes. As used herein,
the term "phenotype" includes morphology, biochemical profiles, changes in tumor
formation and other parameters that are affected by the presence of the P/CAF protein.

The transgenic animals of the invention can also be used in a method for
determining the effectiveness of administering a nucleic acid encoding a functional
P/CAF protein to a subject in need of a functional P/CAF protein. First, a nucleic acid
encoding a nonfunctional P/CAF protein can be introduced into the animal's cells and
expressed to yield a characteristic phenotype. Then, using standard gene therapy
techniques, a nucleic acid encoding a functional P/CAF protein can be introduced into
the animal's cells and the effects on the animal's phenotypic characteristics can be
determined.

Having provided and taught how to obtain a nucleic acid that encodes a P/CAF
protein, an isolated nucleic acid that encodes a fragment of P/CAF protein is also
provided. The nucleic acid encoding the fragment can be obtained using any of the
methods applicable to the nucleic acid encoding the entire P/CAF protein. The nucleic
acid fragment can encode a species-specific P/CAF protein fragment (e.g., found in the
P/CAF protein of humans, but not in the P/CAF proteins of other species). Nucleic
acids encoding species-specific fragments of P/CAF protein are themselves
species-specific or allele-specific fragments of the P/CAF gene.

Examples of fragments of a nucleic acid encoding a fragment of the P/CAF
protein can include the nucleic acid sequences which encode the amino acid sequences
of the fragments of SEQ ID NOS:2 or 4. The same routine computer analyses used to
select these examples of fragments can be routinely used to obtain others. Fragments of
P/CAF-encoding nucleic acids can be primers for PCR or probes, which can be
species-specific, gene-specific or allele-specific. P/CAF-encoding nucleic acid fragments can
encode antigenic or immunogenic fragments of P/CAF protein that can be used in
therapeutic assays or screening protocols. P/CAF gene fragments can encode fragments
of P/CAF protein having histone acetylase activity and/or p300/CBP binding activity as
described above, as well as other uses that may become apparent.

An isolated nucleic acid of at least ten nucleotides that selectively hybridizes with
the nucleic acid of SEQ ID NO:10 under selected conditions is provided. For example,
the conditions can be PCR amplification conditions and the hybridizing nucleic acid can
be a primer consisting of a specific fragment of the reference sequence or a nearly
identical nucleic acid that hybridizes only to the exemplified P/CAF-encoding nucleic
acid or allelic variants thereof.

The invention provides an isolated nucleic acid that selectively hybridizes with
the P/CAF-encoding nucleic acid sequence of SEQ ID NO:10 under stringent
conditions. The hybridizing nucleic acid can be a probe that hybridizes only to the
exemplified P/CAF-encoding nucleic acid sequence. Thus, the hybridizing nucleic acid
can be a naturally occurring species-specific allelic variant of the exemplified P/CAF
gene. The hybridizing nucleic acid can also include insubstantial base substitutions that
do not prevent hybridization under the stated stringent conditions or affect either the
function of the encoded protein, the way the protein accomplishes that function (e.g., its
secondary structure) or the ultimate result of the protein's activity. The means for
determining these parameters are well known.

As used herein to describe nucleic acids, the term "selectively hybridizes"
excludes the occasional randomly hybridizing nucleic acids as well as nucleic acids that
encode other known homologs of the P/CAF protein. The selectively hybridizing
nucleic acids of the invention can have at least 70%, 73%, 78%, 80%, 85%, 88%, 90%,
91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% complementarity with the
segment and strand of the sequence to which it hybridizes. This list is not intended to
exclude percent complementarity values between these values. The nucleic acids can be
at least 10, 15, 16, 17, 18, 20, 21, 23, 24, 25, 30, 35, 40, 50, 100, 150, 200, 300, 500,
550, 750, 900, 950, or 1000 nucleotides in length or any intervening length, depending
on whether the nucleic acid is to be used as a primer, probe or for protein expression.
The hybridizing nucleic acid can comprise a region of at least ten nucleotides (up to full
length) that is completely complementary to a unique region of the nucleic acid to which
it hybridizes.
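The percent complementarity referred to above can be computed for a probe and the equal-length target segment it is aligned against, as in the following minimal sketch. The function name `percent_complementarity` and the example sequences are hypothetical; the sketch assumes an ungapped alignment, DNA bases only, and both strands written 5' to 3'.

```python
# Watson-Crick pairing partners for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(probe: str, target_segment: str) -> float:
    """Percent of probe positions that base-pair with an equal-length target segment.

    Both sequences are written 5'->3'; the probe pairs antiparallel to the
    target, so the target is read in reverse when comparing positions.
    """
    probe = probe.upper()
    target_segment = target_segment.upper()
    if not probe or len(probe) != len(target_segment):
        raise ValueError("probe and target segment must be non-empty and of equal length")
    # Reverse the target so that position i of the probe faces its pairing partner.
    paired = sum(
        COMPLEMENT.get(p) == t for p, t in zip(probe, reversed(target_segment))
    )
    return 100.0 * paired / len(probe)


# Illustrative ten-nucleotide sequences only; not P/CAF sequences.
print(percent_complementarity("GCCTGTAATC", "GATTACAGGC"))  # fully complementary
print(percent_complementarity("GCCTGTAATA", "GATTACAGGC"))  # one mismatch in ten
```

A value computed this way can be compared against the thresholds listed above (for example, at least 70%) for the segment and strand in question.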

The nucleic acid can be an alternative coding sequence for the P/CAF protein, or
can be used as a probe or primer for detecting the presence of or obtaining the P/CAF
protein. If used as primers, the invention provides compositions including at least two
nucleic acids which selectively hybridize with different regions of the nucleic acid so as
to amplify a desired region. Depending on the length of the probe or primer, it can
range between 70% complementary bases and full complementarity and still hybridize
under stringent conditions.

For example, for the purpose of obtaining or determining the presence of a
nucleic acid encoding the P/CAF protein, the degree of complementarity between the
hybridizing nucleic acid (probe or primer) and the sequence to which it hybridizes
(P/CAF DNA in a sample) should be at least enough to exclude hybridization with a
nucleic acid from another species. The invention provides examples of these nucleic
acids of P/CAF, so that the degree of complementarity required to distinguish selectively
hybridizing from nonselectively hybridizing nucleic acids under stringent conditions can
be clearly determined for each nucleic acid. It should also be clear that the hybridizing
nucleic acids of the invention will not hybridize with nucleic acids encoding unrelated
proteins (hybridization is selective) under stringent conditions.

"Stringent conditions" refers to the washing conditions used in a hybridization
protocol. In general, the washing conditions should be a combination of temperature
and salt concentration chosen so that the denaturation temperature is approximately
5-20°C below the calculated Tm of the nucleic acid hybrid under study. The temperature
and salt conditions are readily determined empirically in preliminary experiments in
which samples of reference DNA immobilized on filters are hybridized to the probe or
protein encoding nucleic acid of interest and then washed under conditions of different
stringencies. For example, the nucleic acid sequence of SEQ ID NO:10 was used as a
specific radiolabeled probe for the detection of messenger RNA transcribed from the
P/CAF gene by performing hybridizations under stringent conditions. The Tm of such an
oligonucleotide can be estimated by allowing 2°C for each A or T nucleotide, and 4°C
for each G or C. For example, an 18 nucleotide probe of 50% G+C would, therefore,
have an approximate Tm of 54°C.
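The Tm estimate stated above (2°C for each A or T, 4°C for each G or C) can be reproduced with a short sketch. The function names `estimate_tm` and `wash_window` and the example 18-mer are hypothetical; the sketch assumes a short DNA oligonucleotide containing only A, C, G and T, and applies the 5-20°C offset from the preceding text without modeling salt concentration.

```python
def estimate_tm(oligo: str) -> int:
    """Estimate Tm in degrees C as 2 per A/T base plus 4 per G/C base."""
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("oligonucleotide must contain only A, C, G and T")
    return 2 * at + 4 * gc


def wash_window(oligo: str) -> tuple[int, int]:
    """Return (low, high) temperatures lying 20 and 5 degrees C below the estimated Tm."""
    tm = estimate_tm(oligo)
    return tm - 20, tm - 5


# Illustrative 18-mer with nine G/C and nine A/T bases (50% G+C); not a P/CAF sequence.
probe = "ATGCATGCATGCATGCAG"
print(estimate_tm(probe))   # 9 * 2 + 9 * 4 = 54
print(wash_window(probe))   # (34, 49)
```

The first printed value matches the worked figure in the text: nine A/T bases contribute 18°C and nine G/C bases contribute 36°C, for 54°C in total.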

The invention provides an isolated nucleic acid that selectively hybridizes with
the P/CAF gene shown in the sequence set forth as SEQ ID NO:10 under stringent
conditions. The invention further provides an isolated nucleic acid complementary to
the nucleotide sequence set forth in SEQ ID NO:10.
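As an illustration of what a complementary nucleic acid means at the sequence level, the following minimal sketch derives the complementary strand of a DNA sequence, written 5' to 3'. The function name `reverse_complement` and the example sequence are hypothetical; SEQ ID NO:10 itself is not reproduced here.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT_TABLE = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of a DNA sequence, written 5'->3'."""
    # Complement every base, then reverse so the result also reads 5'->3'.
    return seq.translate(_COMPLEMENT_TABLE)[::-1]


# Illustrative sequence only; not a P/CAF sequence.
print(reverse_complement("GATTACAGGC"))
```

Applying the function twice returns the original sequence, which is a quick sanity check on any complementary strand derived this way.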

Antibodies to the P/CAF protein

A purified antibody and an antiserum containing polyclonal antibodies that
specifically bind the P/CAF protein or antigenic fragment are also provided. The term
"bind" means the well understood antigen/antibody binding as well as other nonrandom
association with an antigen. "Specifically bind" as used herein describes an antibody or
other ligand that does not cross react substantially with any antigen other than the one
specified, in this case, an antigen of the P/CAF protein. Antibodies can be made as
described in Harlow and Lane (33). Briefly, purified P/CAF protein or an antigenic
fragment thereof can be injected into an animal in an amount and in intervals sufficient to
elicit a humoral immune response. Serum polyclonal antibodies can be purified directly,
or spleen cells from the animal can be fused with an immortal cell line and screened for
monoclonal antibody secretion, according to procedures well known in the art. Purified
monospecific polyclonal antibodies that specifically bind the P/CAF antigen are also
within the scope of the present invention. The antibodies of the present invention can
bind the protein of claim 1, the protein of claim 2, the protein of claim 3 and/or the
protein of claim 4, as well as any other proteins of the present invention.

A ligand that specifically binds the antigen is also contemplated. The ligand can
be a fragment of an antibody, such as, for example, an Fab fragment which retains
P/CAF binding activity, or a smaller molecule designed to bind an epitope of the P/CAF
antigen. The antibody or ligand can be bound to a substrate or labeled with a detectable
moiety or both bound and labeled. The detectable moieties contemplated within the
compositions of the present invention include those listed above in the description of the
diagnostic methods, including fluorescent, enzymatic and radioactive markers.

The antibody can be bound to a solid support substrate or conjugated with a
detectable moiety or therapeutic compound or both bound and conjugated. Such
conjugation techniques are well known in the art. For example, conjugation of
fluorescent, radioactive or enzymatic moieties can be performed as described in the art
(33, 43). The detectable moieties contemplated in the present invention can include
fluorescent, radioactive and enzymatic markers and the like. Therapeutic drugs
contemplated with the present invention can include cytotoxic moieties such as ricin A
chain, diphtheria toxin, Pseudomonas exotoxin and other chemotherapeutic compounds.

It is well understood by one of skill in the art that all of the above discussion
regarding antibodies to P/CAF can also be applied with regard to production,
characterization and use of antibodies which bind the p300/CBP protein or any of the
DNA-binding transcription factors of this invention.

Measuring the P/CAF protein in a sample

The present invention also provides a method for determining the presence and
thus the amount of P/CAF protein in a biological sample. As used herein, a biological
sample includes any tissue or cell which would contain the P/CAF protein. Examples
include cells in tissues taken from surgical biopsies, cells isolated from a body fluid and
cells prepared in an in vitro tissue culture environment.

One example of determining the amount of P/CAF in a biological sample can
comprise contacting the biological sample with a polypeptide comprising the amino acid
sequence of SEQ ID NO:3 under conditions whereby a P/CAF/p300 complex can be
formed; and determining the amount of the P/CAF/p300 complex, the amount of the
complex indicating the amount of P/CAF in the sample. Determination of the amount
of P/CAF/p300 complex can be accomplished through techniques standard in the art.
For example, the complex may be precipitated out of a solution and detected by the
addition of a detectable moiety conjugated to the p300 protein or by the detection of an
antibody which binds p300 or the P/CAF protein, as taught in the Examples herein.
Antibodies which bind p300 or the P/CAF protein can be either monoclonal or
polyclonal antibodies and can be obtained as described herein. Detection of
P/CAF/p300 complexes by the detection of the binding of antibodies reactive with p300
or the P/CAF protein can be accomplished using various immunoassays as are available
in the art, as described below.

Alternatively, determination of the amount of P/CAF in a biological sample can
comprise contacting the biological sample with a polypeptide comprising the amino acid
sequence of SEQ ID NO:9 under conditions whereby a P/CAF/CBP complex can be
formed; and determining the amount of the P/CAF/CBP complex, the amount of the
complex indicating the amount of P/CAF in the sample. Determination of the amount
of P/CAF/CBP complex can be accomplished through techniques standard in the art.
For example, the complex may be precipitated out of a solution and detected by the
addition of a detectable moiety conjugated to the CBP protein or by the detection of an
antibody which binds either CBP or the P/CAF protein, as taught in the Examples
herein. Antibodies which bind CBP or the P/CAF protein can be either monoclonal or
polyclonal antibodies and can be obtained as described herein. Detection of P/CAF/CBP
complexes by the detection of the binding of antibodies reactive with CBP or the P/CAF
protein can be accomplished using various immunoassays as are available in the art, as
described below.

Another example of determining the amount of P/CAF in a biological sample
comprises contacting the biological sample with an antibody which specifically binds
P/CAF under conditions whereby a P/CAF/antibody complex can be formed and
determining the amount of the P/CAF/antibody complex, the amount of the complex
indicating the amount of P/CAF in the sample. Antibodies which bind P/CAF can be
either monoclonal or polyclonal antibodies and can be obtained as described herein.
Determination of P/CAF/antibody complexes can be accomplished using various
immunoassays as are available in the art, as described below.

Immunoassays such as immunofluorescence assays, radioimmunoassays (RIA),
immunoblotting and enzyme linked immunosorbent assays (ELISA) can be readily
adapted for detection and measurement of P/CAF in a biological sample. Both
polyclonal and monoclonal antibodies can be used in the assays. Available
immunoassays are well known in the art and are extensively described in the patent and
scientific literature. See, for example, U.S. Patent Nos. 3,791,932; 3,839,153;
3,850,752; 3,850,578; 3,853,987; 3,867,517; 3,879,262; 3,901,654; 3,935,074;
3,984,533; 3,996,345; 4,034,074; and 4,098,876.

Screening assays for P/CAF

The present invention also provides a bioassay for screening substances for the
ability to inhibit the histone acetyltransferase activity of P/CAF comprising contacting a
system, in which histone acetylation by P/CAF can be determined, with the substance
under conditions whereby histone acetylation by P/CAF can occur; determining the
amount of histone acetylation by P/CAF in the presence of the substance; and comparing
the amount of histone acetylation by P/CAF in the presence of the substance with the
amount of histone acetylation by P/CAF in the absence of the substance, a decreased
amount of histone acetylation by P/CAF in the presence of the substance indicating a
substance that can inhibit the histone acetyltransferase activity of P/CAF. The
acetylation of histones by P/CAF can be determined in a system including, for example,
either core histones (histones H2A, H2B, H3 and H4) or the nucleosome core particles
(146 base pairs of DNA wrapped around the octamer of core histones) as substrates, the
P/CAF protein and radiolabeled acetyl-CoA (e.g., [1-14C]acetyl-CoA). The presence of
acetylated histones can be detected by autoradiography after separation by SDS-PAGE
as described herein in the Examples. Thus, the compound to be tested for the ability to
inhibit the histone acetyltransferase activity of P/CAF can be added to this system and
assayed for inhibiting ability.
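The comparison step of such a bioassay (acetylation in the presence of the substance versus acetylation in its absence) can be sketched as follows. The function names, the example signal values and the 20% cutoff are all hypothetical assumptions made for illustration; the disclosure does not specify a numerical threshold, and the signals here stand for any quantified readout such as densitometry of an autoradiogram band.

```python
def percent_change(signal_with: float, signal_without: float) -> float:
    """Percent change in acetylation signal relative to the no-substance control."""
    if signal_without <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * (signal_with - signal_without) / signal_without


def classify_substance(
    signal_with: float, signal_without: float, threshold_pct: float = 20.0
) -> str:
    """Label a substance by comparing acetylation with and without it.

    `threshold_pct` is an assumed cutoff chosen for illustration only; in
    practice it would be set from the variability of replicate controls.
    """
    change = percent_change(signal_with, signal_without)
    if change <= -threshold_pct:
        return "inhibitor"   # decreased acetylation in the presence of the substance
    if change >= threshold_pct:
        return "stimulator"  # increased acetylation in the presence of the substance
    return "no effect"


# Hypothetical band intensities (arbitrary units).
print(percent_change(400.0, 1000.0))      # -60.0
print(classify_substance(400.0, 1000.0))  # inhibitor
```

The same comparison applies to the binding assays and stimulation assays described in this section, with the measured quantity being the amount of complex formed or an increase rather than a decrease in signal.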

The present invention also provides a bioassay for screening substances for the
ability to inhibit the transcription modulating activity of P/CAF, comprising contacting a
system, in which histone acetylation by P/CAF can be determined, with the substance
under conditions whereby histone acetylation by P/CAF can occur; determining the
amount of histone acetylation by P/CAF in the presence of the substance; and comparing
the amount of histone acetylation by P/CAF in the presence of the substance with the
amount of histone acetylation by P/CAF in the absence of the substance, a decreased
amount of histone acetylation by P/CAF in the presence of the substance indicating a
substance that can inhibit the transcription modulating activity and cell cycle progression
suppressing activity of P/CAF. The acetylation of histones by P/CAF can be determined
in a system including, for example, either core histones (histones H2A, H2B, H3 and
H4) or the nucleosome core particles (146 base pairs of DNA wrapped around the
octamer of core histones) as substrates, the P/CAF protein and radiolabeled acetyl-CoA
(e.g., [1-14C]acetyl-CoA). The presence of acetylated histones can be detected by
autoradiography after separation by SDS-PAGE as described herein in the Examples.
Thus, the compound to be tested for the ability to inhibit the transcription modulating
activity of P/CAF by interfering with the histone acetyltransferase activity of P/CAF can
be added to this system and assayed for inhibiting ability.

Also provided in the present invention is a bioassay for screening substances for
the ability to inhibit the binding of p300 to P/CAF, comprising contacting a system in
which the binding of p300 to P/CAF can be determined, with the substance under
conditions whereby the binding of p300 and P/CAF can occur; determining the amount
of p300 binding to P/CAF in the presence of the substance; and comparing the amount
of p300 binding to P/CAF in the presence of the substance with the amount of p300
binding to P/CAF in the absence of the substance, a decreased amount of p300 binding
to P/CAF in the presence of the substance indicating a substance that can inhibit the
binding of p300 to P/CAF. The binding of p300 to P/CAF can be determined in a
system, for example, which can include a cell free reaction mixture comprising a
fragment of the p300 protein comprising the amino acid sequence of SEQ ID NO:3 and
P/CAF. Alternatively, the system can comprise a cell extract produced from cells
producing both p300 and P/CAF. Determination of the binding of p300 to P/CAF can
be carried out as taught herein.

Additionally provided in the present invention is a bioassay for screening
substances for the ability to inhibit the binding of CBP to P/CAF, comprising contacting
a system in which the binding of CBP to P/CAF can be determined, with the substance
under conditions whereby the binding of CBP to P/CAF can occur; determining the
amount of CBP binding to P/CAF in the presence of the substance; and comparing the
amount of CBP binding to P/CAF in the presence of the substance with the amount of
CBP binding to P/CAF in the absence of the substance, a decreased amount of CBP
binding to P/CAF in the presence of the substance indicating a substance that can inhibit
the binding of CBP to P/CAF. The binding of CBP to P/CAF can be determined in a
system, for example, which can include a cell free reaction mixture comprising a
fragment of the CBP protein comprising the amino acid sequence of SEQ ID NO:9 and
P/CAF. Alternatively, the system can comprise a cell extract produced from cells
producing both CBP and P/CAF. Determination of the binding of CBP to P/CAF can be
carried out as taught herein.



The present invention further contemplates a bioassay for screening substances
for the ability to stimulate the histone acetyltransferase activity of P/CAF comprising
contacting a system, in which histone acetylation by P/CAF can be determined, with the
substance; determining the amount of histone acetylation by P/CAF in the presence of
the substance; and comparing the amount of histone acetylation by P/CAF in the
presence of the substance with the amount of histone acetylation by P/CAF in the
absence of the substance, an increased amount of histone acetylation by P/CAF in the
presence of the substance indicating a substance that can stimulate the histone
acetyltransferase activity of P/CAF. The acetylation of histones by P/CAF can be
determined in a system including, for example, either core histones (histones H2A, H2B,
H3 and H4) or the nucleosome core particles (146 base pairs of DNA wrapped around
the octamer of core histones) as substrates, the P/CAF protein and radiolabeled
acetyl-CoA (e.g., [1-14C]acetyl-CoA). The presence of acetylated histones can be detected by
autoradiography after separation by SDS-PAGE as described herein in the Examples.
Thus, the compound to be tested for the ability to stimulate the histone acetyltransferase
activity of P/CAF can be added to this system and assayed for stimulating ability.

The present invention further contemplates a bioassay for screening substances
for the ability to stimulate the transcription modulating activity of P/CAF comprising
contacting a system, in which histone acetylation by P/CAF can be determined, with the
substance; determining the amount of histone acetylation by P/CAF in the presence of
the substance; and comparing the amount of histone acetylation by P/CAF in the
presence of the substance with the amount of histone acetylation by P/CAF in the
absence of the substance, an increased amount of histone acetylation by P/CAF in the
presence of the substance indicating a substance that can stimulate the transcription
modulating activity of P/CAF. The acetylation of histones by P/CAF can be determined
in a system including, for example, either core histones (histones H2A, H2B, H3 and
H4) or the nucleosome core particles (146 base pairs of DNA wrapped around the
octamer of core histones) as substrates, the P/CAF protein and radiolabeled acetyl-CoA
(e.g., [1-14C]acetyl-CoA). The presence of acetylated histones can be detected by
autoradiography after separation by SDS-PAGE as described herein in the Examples.
Thus, the compound to be tested for the ability to stimulate the transcription modulating
activity of P/CAF by increasing the histone acetyltransferase activity of P/CAF can be
added to this system and assayed for stimulating ability.

The present invention further provides a bioassay for screening substances for
the ability to stimulate binding of p300 to P/CAF, comprising contacting a system in
which the binding of p300 to P/CAF can be determined, with the substance under
conditions whereby the binding of p300 to P/CAF can occur; determining the amount of
p300 binding to P/CAF in the presence of the substance; and comparing the amount of
p300 binding to P/CAF in the presence of the substance with the amount of p300
binding to P/CAF in the absence of the substance, an increased amount of p300 binding
to P/CAF in the presence of the substance indicating a substance that can stimulate the
binding of p300 to P/CAF. The binding of p300 to P/CAF can be determined in a
system, for example, which can include a cell free reaction mixture comprising a
fragment of the p300 protein comprising the amino acid sequence of SEQ ID NO:3 and
P/CAF. Alternatively, the system can comprise a cell extract produced from cells
producing both p300 and P/CAF. Determination of the binding of p300 to P/CAF can
be carried out as taught herein.

Additionally provided in the present invention is a bioassay for screening 
substances for the ability to stimulate the binding of CBP to P/CAF, comprising 
contacting a system in which the binding of CBP to P/CAF can be determined, with the 
substance under conditions whereby the binding of CBP to P/CAF can occur; 
determining the amount of CBP binding to P/CAF in the presence of the substance; and 
comparing the amount of CBP binding to P/CAF in the presence of the substance with 
the amount of CBP binding to P/CAF in the absence of the substance, an increased 
amount of CBP binding to P/CAF in the presence of the substance indicating a 
substance that can stimulate the binding of CBP to P/CAF. The binding of CBP to 
P/CAF can be determined in a system, for example, which can include a cell free 
reaction mixture comprising a fragment of the CBP protein comprising the amino acid 
sequence of SEQ ID NO: 9 and P/CAF. Alternatively, the system can comprise a cell 
extract produced from cells producing both CBP and P/CAF. Determination of the 
binding of CBP to P/CAF can be carried out as taught herein. 

Transcription modulating activity of P/CAF 

The present invention contemplates a method for inhibiting the transcription 
modulating activity of P/CAF in a subject, comprising administering to the subject a 
transcription modulating activity inhibiting amount of a substance in a pharmaceutically 
acceptable carrier. For example, the substance can be identified according to the 
protocols provided herein as one that can inhibit the transcription modulating activity of 
P/CAF by preventing the binding of P/CAF to p300/CBP or by inhibiting the histone 
acetyltransferase activity of P/CAF as well as by any other inhibitory mechanism as 
identified by the protocols provided herein. Inhibition of the transcription modulating 
activity of P/CAF in a subject is desirable, for example, to inhibit HIV TAT-mediated 
transcription and therefore, the method of the present invention can be used to treat 
HIV-infected subjects. 

The substance can be in a pharmaceutically acceptable carrier. By 
"pharmaceutically acceptable" is meant a material that is not biologically or otherwise 
undesirable, i.e., the material may be administered to a subject, along with the substance, 
without causing any undesirable biological effects or interacting in a deleterious manner 
with any of the other components of the pharmaceutical composition in which it is 
contained. The carrier would naturally be selected to minimize any degradation of the 
active ingredient and to minimize any adverse side effects in the subject. 

The transcription modulating activity and/or histone acetyltransferase activity of 
P/CAF can be inhibited in a subject by administering to the subject a substance which 
binds p300/CBP at the P/CAF binding site or a substance which binds the P/CAF 
protein at the p300/CBP binding site, the ultimate result being that P/CAF and 
p300/CBP do not bind with one another and P/CAF cannot exert its transcription 
modulating and/or histone acetyltransferase effect. The substance can be a protein, such 
as an antibody which binds the P/CAF protein at or near the p300/CBP 
binding site, thereby preventing its binding, or an antibody which binds the p300/CBP 
protein at or near the P/CAF binding site, thereby preventing its binding. The substance 
can also bind the histone acetyltransferase site on P/CAF or the acetylation site on the 
histone, thereby preventing acetylation by P/CAF. 

The substance which binds p300/CBP, the P/CAF protein or the histone and has 
the net effect of inhibiting the transcription modulating effect and/or histone 
acetyltransferase activity of P/CAF in the cell can be delivered to a cell in the subject by 
mechanisms well known in the art. 

Alternatively, a nucleic acid encoding a protein which binds either to p300/CBP 
or the P/CAF protein and has the net effect of inhibiting the transcription modulating 
effect and/or histone acetyltransferase activity of P/CAF in the cell can be delivered to a 
cell in the subject by gene transduction mechanisms well known in the art. For example, 
nucleic acid can be introduced by liposomes as well as via retroviral or adeno-associated 
viral vectors, as described below. 



The substance which inhibits the transcription modulating effect and/or histone 
acetyltransferase activity of P/CAF can be an antisense RNA or an antisense DNA which 
binds the RNA or DNA of P/CAF, thereby preventing translation or transcription of the 
RNA or DNA encoding P/CAF and having the net effect of inhibiting the transcription 
modulating effect and/or histone acetyltransferase activity of P/CAF by inhibiting P/CAF 
production. The antisense RNA of the present invention can be generated from the 
nucleic acid of SEQ ID NO: 14 (human) or SEQ ID NO: 15 (mouse). Furthermore, the 
antisense DNA can be a phosphorothioate oligodeoxyribonucleotide having the 
nucleotide sequence of SEQ ID NO: 16 (human) or of SEQ ID NO: 17 (mouse). The 
mouse antisense RNA can be used to inhibit the activity of mouse P/CAF, having the 
nucleotide sequence of SEQ ID NO: 18 and the amino acid sequence of SEQ ID NO: 8. 
The present invention also contemplates an antisense nucleic acid sequence which can 
bind the DNA or RNA of any of the transcription factors or other proteins now known 
or later identified to bind P/CAF, thereby inhibiting expression of the gene products of 
these proteins and having the net effect of inhibiting the transcription modulating effect 
and/or histone acetyltransferase activity of P/CAF. 

The antisense nucleic acid can comprise a typical nucleic acid, but the antisense 
nucleic acid can also be a modified nucleic acid or a derivative of a nucleic acid such as a 
phosphorothioate analogue of a nucleic acid. The composition can comprise, for 
example, an antisense RNA that specifically binds an RNA encoded by the gene 
encoding the serum protein. Antisense RNAs can be synthesized and used by standard 
methods (62). 

Antisense RNA can inhibit gene expression by forming an RNA/RNA duplex 
between the antisense RNA and the RNA transcribed from the target gene. The precise 
mechanism by which this duplex formation decreases the production of the protein 
encoded by the endogenous gene probably involves binding of complementary regions 
of the normal sense mRNA and the antisense RNA strand with duplex formation in a 
manner that blocks RNA processing and translation. Alternative mechanisms include 
the formation of a triplex between the antisense RNA and duplex DNA or the formation 
of a DNA-RNA duplex with subsequent degradation of DNA-RNA hybrids by RNase 
H. Furthermore, an antigene effect can result from certain DNA-based oligonucleotides 
via triple-helix formation between the oligomer and double-stranded DNA which results 
in the repression of gene transcription. Regardless of the specific molecular mechanism, 
the present invention results in inhibition of expression of the P/CAF gene by the 
introduced and replicated DNA resulting in inhibition of the transcription modulating 
and/or histone acetyltransferase activity of P/CAF by a reduction in the expression of 
the nucleic acid to which the antisense nucleic acid is hybridized, and therefore a 
reduction of the gene product from the targeted gene. 

The antisense nucleic acid may be obtained by any number of techniques known 
to one skilled in the art. One method of constructing an antisense nucleic acid is to 
synthesize a recombinant antisense DNA molecule. For example, oligonucleotide 
synthesis procedures are routine in the art and oligonucleotides coding for a particular 
protein or regulatory region are readily obtainable through automated DNA synthesis. 
A nucleic acid for one strand of a double-stranded molecule can be synthesized and 
hybridized to its complementary strand. One can design these oligonucleotides such that 
the resulting double-stranded molecule has either internal restriction sites or appropriate 
5' or 3' overhangs at the termini for cloning into an appropriate vector. Double-stranded 
molecules coding for relatively large proteins or regulatory regions can be synthesized 
by first constructing several different double-stranded molecules that code for particular 
regions of the protein or regulatory region, followed by ligating these DNA molecules 
together. Once the appropriate DNA molecule is synthesized, this DNA can be cloned 
downstream of a promoter in an antisense orientation. Techniques such as this are 
routine in the art and are well documented. 
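
As a small illustration of what cloning "in an antisense orientation" implies at the sequence level, the following Python sketch derives the reverse complement of a sense-strand fragment and the antisense RNA that would be transcribed from it. The function names and the six-nucleotide fragment are hypothetical and chosen only for the example; the sequences of the SEQ ID NOs referred to herein are not reproduced.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    unknown = set(dna) - set("ACGTacgt")
    if unknown:
        raise ValueError(f"unexpected characters: {sorted(unknown)}")
    return dna.translate(COMPLEMENT)[::-1]


def antisense_rna(sense_dna):
    """RNA (5' to 3') complementary to the transcript of the given sense strand."""
    return reverse_complement(sense_dna).upper().replace("T", "U")


# Hypothetical sense-strand fragment, for illustration only.
fragment = "ATGGCC"
print(reverse_complement(fragment))
print(antisense_rna(fragment))
```

The antisense RNA produced this way pairs, in antiparallel orientation, with the mRNA transcribed from the sense strand, which is the duplex formation discussed above.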



An example of another method of obtaining an antisense nucleic acid is to isolate 
that nucleic acid from the organism in which it is found and clone it in an antisense 
orientation. For example, a DNA or cDNA library can be constructed and screened for 
the presence of the nucleic acid of interest. Methods of constructing and screening such 
libraries are well known in the art and kits for performing the construction and screening 
steps are commercially available (for example, Stratagene Cloning Systems, La Jolla, 
CA). Once isolated, the nucleic acid can be directly cloned into an appropriate vector in 
an antisense orientation, or if necessary, be modified to facilitate the subsequent cloning 
steps. Such modification steps are routine, an example of which is the addition of 
oligonucleotide linkers which contain restriction sites to the termini of the nucleic acid. 
General methods are set forth in Sambrook et al. (39). 

The DNA that is introduced into the cell is in an expression orientation that is 
antisense to a corresponding endogenous DNA or RNA of the cells. For example, 
where an endogenous DNA comprises a gene which encodes a particular protein, the 
introduced DNA is in an expression orientation opposite the expression of the 
endogenous DNA, that is, the DNA operatively linked to a promoter is in an antisense 
expression orientation relative to the corresponding endogenous gene. The introduced 
DNA may be homologous to the entire transcribed gene or homologous to only part of 
the transcribed gene. Alternatively, the sequence of the introduced DNA may be 
divergent from that of the endogenous DNA but only divergent to the extent that 
hybridization of the nucleic acids occurs, thereby preventing transcription. One skilled 
in the art can determine the maximum extent of this divergence by routine screening of 
antisense DNAs corresponding to an endogenous DNA of the cell. In this manner, one 
skilled in the art can readily determine which fragments, or alternatively the extent of 
homology of the fragments or the entire gene, are necessary to inhibit gene 
expression. 

The antisense nucleic acids of the present invention can be made according to 
protocols standard in the art, as well as described in the Examples provided herein. The 
antisense nucleic acids can be administered to a subject according to the gene 
transduction protocols standard in the art, as described below. 

The present invention also contemplates a method for stimulating the 
transcription modulating activity and/or histone acetyltransferase activity of P/CAF in a 
subject comprising administering to the subject a substance, in a pharmaceutically 
acceptable carrier, determined according to the methods taught herein, to have a 
stimulatory effect on the transcription modulating and/or histone acetyltransferase 
activity of P/CAF. The substance can be one which has been identified, according to the 
protocols provided herein, to stimulate histone acetyltransferase activity in P/CAF or 
promote binding of P/CAF to p300/CBP. The stimulation of the transcription 
modulation activity and/or histone acetyltransferase activity of P/CAF in a subject is 
desirable, for example, to activate tumor suppressor p53 (which promotes apoptosis) or 
to activate the muscle differentiation factor, MyoD. Thus, the method of the present 
invention can be employed to treat cancer and to promote muscle differentiation in 
conditions where muscle differentiation is desired. The substance can be delivered to a 
cell in the subject by mechanisms well known in the art. 

Further contemplated in the present invention is a method for promoting binding 
of P/CAF to p300/CBP in a subject, comprising administering to the subject a substance 
identified by the methods provided herein to promote binding of P/CAF to either p300 
or CBP. 

Additionally, a nucleic acid encoding a protein which stimulates the transcription 
modulating activity and/or histone acetyltransferase activity of P/CAF can be delivered 
to a cell in the subject by gene transduction mechanisms, as described below. 

Also provided in the present invention is a method of inhibiting the cell cycle 
progression inducing effect of an oncoprotein which binds p300/CBP in a subject 
comprising transducing the cells of the subject with a vector comprising a nucleic acid 
encoding the P/CAF protein; inducing expression of the nucleic acid in the cell to 
produce the P/CAF in an amount which will allow the P/CAF gene product to replace 
the oncoprotein bound to p300/CBP, whereby the replacement of the oncoprotein 
bound to p300/CBP by the P/CAF gene product inhibits the cell cycle progression 
inducing effect of the oncoprotein. The oncoprotein which binds p300/CBP in the cell 
can be the adenovirus E1A oncoprotein. 

A method for providing a functional P/CAF protein to a subject in need of the 
functional P/CAF protein is also provided, comprising transducing the cells of the 
subject with a vector comprising a nucleic acid encoding the P/CAF protein and 
inducing expression of the nucleic acid to produce the functional P/CAF protein in the 
cell, thereby providing the functional P/CAF protein to the subject. The transduction of 
the vector nucleic acid into the subject's cells can be carried out according to standard 
gene therapy protocols well known in the art (see, for example, U.S. Patent No. 
5,339,346). 

Screening assays for p300/CBP 

The present invention also provides a bioassay for screening substances for the 
ability to inhibit the histone acetyltransferase activity of p300/CBP comprising 
contacting a system, in which histone acetylation by p300/CBP can be determined, with 
the substance under conditions whereby histone acetylation by p300/CBP can occur; 
determining the amount of histone acetylation by p300/CBP in the presence of the 
substance; and comparing the amount of histone acetylation by p300/CBP in the 
presence of the substance with the amount of histone acetylation by p300/CBP in the 
absence of the substance, a decreased amount of histone acetylation by p300/CBP in the 
presence of the substance indicating a substance that can inhibit the histone 
acetyltransferase activity of p300/CBP. The acetylation of histones by p300/CBP can be 
determined in a system including, for example, either core histones (histones H2A, H2B, 
H3 and H4) or the nucleosome core particles (146 base pairs of DNA wrapped around 
the octamer of core histones) as substrates, the p300/CBP protein and radiolabeled 
acetyl-CoA (e.g., [1-14C]acetyl-CoA). The presence of acetylated histones can be 
detected by autoradiography after separation by SDS-PAGE as described herein in the 
Examples. Thus, the compound to be tested for the ability to inhibit the histone 
acetyltransferase activity of p300/CBP can be added to this system and assayed for 
acetyltransferase inhibiting ability. 

Also provided in the present invention is a bioassay for screening substances for 
the ability to inhibit the binding of a transcriptional factor to p300/CBP, comprising 
contacting a system in which the binding of a transcriptional factor to p300/CBP can be 
determined, with the substance under conditions whereby the binding of the 
transcriptional factor and p300/CBP can occur; determining the amount of 
transcriptional factor binding to p300/CBP in the presence of the substance; and 
comparing the amount of transcriptional factor binding to p300/CBP in the presence of 
the substance with the amount of transcriptional factor binding to p300/CBP in the 
absence of the substance, a decreased amount of transcriptional factor binding to 
p300/CBP in the presence of the substance indicating a substance that can inhibit the 
binding of a transcriptional factor to p300/CBP. The binding of a transcriptional factor 
to p300/CBP can be determined in a system, for example, which can include a cell free 
reaction mixture comprising a transcriptional factor which binds p300/CBP and 
p300/CBP. Alternatively, the system can comprise a cell extract produced from cells 
producing both a transcriptional factor which binds p300/CBP and p300/CBP. The 
transcriptional factor which binds p300/CBP can be selected from, but is not limited to, 
the group consisting of nuclear hormone receptors, CREB, c-Jun/v-Jun, c-Myb/v-Myb, 
YY1, Sap-1a, c-Fos, MyoD and SRC-1, as well as any other transcriptional factor now 
known or later identified to bind p300/CBP. The screening assay of the present 
invention can also be used to identify substances which inhibit the binding of p300/CBP 
to other components to which it is known to bind, for example, P/CAF, pp90rsk, TFIIB, 
E1A, SV40 large T antigen, as well as any other substances now known or later 
identified to bind p300/CBP. Determination of the binding of a transcriptional factor or 
other substance to p300/CBP can be carried out as taught in the Examples herein as well 
as by protocols described in the literature. 



The present invention further contemplates a bioassay for screening substances 
for the ability to stimulate the histone acetyltransferase activity of p300/CBP comprising 
contacting a system, in which histone acetylation by p300/CBP can be determined, with 
the substance; determining the amount of histone acetylation by p300/CBP in the 
presence of the substance; and comparing the amount of histone acetylation by 
p300/CBP in the presence of the substance with the amount of histone acetylation by 
p300/CBP in the absence of the substance, an increased amount of histone acetylation 
by p300/CBP in the presence of the substance indicating a substance that can stimulate 
the histone acetyltransferase activity of p300/CBP. The acetylation of histones by 
p300/CBP can be determined in a system including, for example, either core histones 
(histones H2A, H2B, H3 and H4) or the nucleosome core particles (146 base pairs of 
DNA wrapped around the octamer of core histones) as substrates, the p300/CBP 
protein and radiolabeled acetyl-CoA (e.g., [1-14C]acetyl-CoA). The presence of 
acetylated histones can be detected by autoradiography after separation by SDS-PAGE 
as described herein in the Examples. Thus, the compound to be tested for the ability to 
stimulate the histone acetyltransferase activity of p300/CBP can be added to this system 
and assayed for stimulating ability. 



The present invention further provides a bioassay for screening substances for 
the ability to stimulate binding of a component, which binds p300/CBP, to p300/CBP, 
comprising contacting a system in which the binding of the component to p300/CBP can 
be determined, with the substance under conditions whereby the binding of the 
component to p300/CBP can occur; determining the amount of component binding to 
p300/CBP in the presence of the substance; and comparing the amount of component 
binding to p300/CBP in the presence of the substance with the amount of component 
binding to p300/CBP in the absence of the substance, an increased amount of 
component binding to p300/CBP in the presence of the substance indicating a substance 
that can stimulate the binding of the component to p300/CBP. The binding of the 
component to p300/CBP can be determined in a system, for example, which can include 
a cell free reaction mixture comprising the component and p300/CBP. Alternatively, the 
system can comprise a cell extract produced from cells producing both the component 
and p300/CBP. The component which binds p300/CBP can be any of the transcriptional 
factors or other proteins which are known or are identified in the future to bind 
p300/CBP, as set forth above. Determination of the binding of the component to 
p300/CBP can be carried out as taught in the Examples provided herein and according 
to protocols available in the literature. 

Histone acetyltransferase activity of p300/CBP 

A method for inhibiting the histone acetyltransferase activity of p300/CBP in a 
subject is provided in the present invention, comprising administering to the subject a 
histone acetyltransferase activity inhibiting amount of a substance in a pharmaceutically 
acceptable carrier. The mechanism of the inhibitory action of the substance can be the 
inhibition of the binding of a DNA-binding transcription factor, such as, for example, a 
nuclear hormone receptor, CREB, c-Jun/v-Jun, c-Myb/v-Myb, YY1, Sap-1a, c-Fos, 
MyoD or SRC-1, to p300/CBP. 

The histone acetyltransferase activity of p300/CBP can be inhibited in a subject 
by administering to the subject a substance which binds p300/CBP at the transcription 
factor binding site or a substance which binds the transcription factor protein at the 
p300/CBP binding site, the ultimate result being that the transcription factor and 
p300/CBP do not bind with one another and p300/CBP cannot acetylate histones. 

The substance which binds either to the transcription factor or the p300/CBP 
protein and has the net effect of inhibiting the histone acetyltransferase activity of 
p300/CBP in the cell can be identified according to the screening methods provided 
herein and delivered to a cell in the subject by mechanisms well known in the art. The 
substance can be a protein, such as an antibody which binds the p300/CBP protein 
at or near the DNA-binding transcription factor binding site, thereby 
preventing its binding, or an antibody which binds the DNA-binding transcription factor 
at or near the p300/CBP binding site, thereby preventing its binding. The substance can 
also bind the histone acetyltransferase site on p300/CBP (aa 1195-1673 on p300 or aa 
1174-1850 on CBP) or the acetylation site on the histone, thereby preventing 
acetylation by p300/CBP. 

Additionally, the substance can be a nucleic acid which can be expressed in the 
cell to produce a protein which inhibits the histone acetyltransferase activity of 
p300/CBP. For example, a nucleic acid encoding a protein which binds either to a 
transcription factor or the p300/CBP protein and has the net effect of inhibiting the 
histone acetyltransferase activity of p300/CBP in the cell can be delivered to a cell in the 
subject by gene transduction mechanisms well known in the art. For example, nucleic 
acid can be introduced by liposomes as well as via retroviral or adeno-associated viral 
vectors, as described below. 

The substance which inhibits the histone acetyltransferase activity of p300/CBP 
can be an antisense RNA or an antisense DNA which binds the RNA or DNA of 
p300/CBP, thereby preventing translation or transcription of the RNA or DNA encoding 
p300/CBP and having the net effect of inhibiting the histone acetyltransferase activity of 
P/CAF by inhibiting p300/CBP production. The antisense RNA or DNA of the present 
invention can be produced and introduced into cells according to the same methods as 
set forth above for P/CAF antisense nucleic acids. 



The present invention also contemplates a method for stimulating the histone 
acetyltransferase activity of p300/CBP in a subject comprising administering to the 
subject a histone acetyltransferase activity stimulating amount of a substance, in a 
pharmaceutically acceptable carrier, determined according to the methods taught herein, 
to have a stimulatory effect on the histone acetyltransferase activity of p300/CBP. The 
substance can exert a stimulatory effect by promoting the binding of a DNA-binding 
transcription factor of the present invention to p300/CBP. The substance can be 
delivered to a cell in the subject by mechanisms well known in the art. A nucleic acid 
encoding a protein which stimulates the transcription modulating activity of p300/CBP 
can be delivered to a cell in the subject by gene transduction mechanisms, as described 
below. 

Gene transduction 

In the methods described above which include gene transduction into cells (i.e., 
addition of exogenous DNA into cells), the nucleic acids of the present invention can be 
in a vector for delivering the nucleic acids to the site for expression of the P/CAF 
protein. The vector can be one of the commercially available preparations, such as the 
pGM plasmid (Promega). Vector delivery can be by liposome, using commercially 
available liposome preparations or newly developed liposomes having the features of the 
present liposomes. Additionally, vector delivery can be via a viral system, including, but 
not limited to, retroviral, adenoviral and adeno-associated viral systems. Other delivery 
methods can be adopted and routinely tested according to the methods taught herein. 

The modes of administration of the liposome will vary predictably according to 
the disease being treated and the tissue being targeted. For example, for treating cancer 
in either the lung or the liver, which are both sinks for liposomes, intravenous delivery is 
reasonable. For other localized cancers, as well as precancerous conditions, 
catheterization of an artery upstream from the target organ is a preferred mode of 
delivery, because it avoids significant clearance of the liposome by the lung and liver. 
For cancerous lesions at a number of other sites (e.g., skin cancer, localized dysplasias), 
topical delivery is expected to be effective and may be preferred, because of its 
convenience. 
Leukemias and other disorders involving dysregulated proliferation of certain 
isolatable cell populations may be more readily treated by ex vivo administration of the 
nucleic acid. 



The liposomes may be administered topically, parenterally (e.g., intravenously), 
by intramuscular injection, by intraperitoneal injection, transdermally, extracorporeally 
or the like, although intravenous or topical administration is typically preferred. The 
exact amount of the liposomes required will vary from subject to subject, depending on 
the species, age, weight and general condition of the subject, the severity of the disease 
being treated, the particular compound used, its mode of administration and the like. 
Thus, it is not possible to specify an exact amount. However, an appropriate amount 
may be determined by one of ordinary skill in the art using only routine experimentation 
given the teachings herein. 

Parenteral administration, if used, is generally characterized by injection. 
Injectables can be prepared in conventional forms, either as liquid solutions or 
suspensions, solid forms suitable for solution or suspension in liquid prior to injection, or 
as emulsions. A more recently revised approach for parenteral administration involves 
use of a slow release or sustained release system such that a constant level of dosage is 
maintained. See, e.g., U.S. Patent No. 3,610,795, which is incorporated by reference 
herein. 

Topical administration can be by creams, gels, suppositories and the like. Ex 
vivo (extracorporeal) delivery can be as typically used in other contexts. 

The present invention is more particularly described in the following examples 
which are intended as illustrative only since numerous modifications and variations 
therein will be apparent to those skilled in the art. 



EXAMPLES 

1. P/CAF studies. 

Cloning and characterization of P/CAF protein. 

In human cells, CBP binds to c-Jun in a phosphorylation-dependent manner in 
association with stimulation of transcription (9). In yeast, GCN4 is believed to be a 
c-Jun counterpart on the basis of similarities in DNA recognition (15) as well as the 
participation of both proteins in UV signaling pathways (16). Yeast genetic screening 
has led to the isolation of various cofactors for GCN4, including GCN5 (yGCN5), 
ADA2 (yADA2) and ADA3 (yADA3) (17-19). These factors are considered to 
function as a complex (or in a common pathway) based on genetic and protein-protein 
interaction studies (18-22). Finally, p300/CBP and yADA2 exhibit significant sequence 
similarity within a 50 amino acid region including a Zn2+ finger motif (3). Human 
counterparts to yGCN5, yADA2, or yADA3 that interact with p300/CBP to mediate 
transcriptional activation by c-Jun were searched for in various nucleotide sequence 
databases. 

Comparison of the yGCN5 protein sequence with various databases (23) 
revealed significant similarities with the two randomly sequenced human cDNAs, 
ETS05039 (24) (P=4.0xl0'^^) and NIB2000-5R (P=6.5xlO-^). Given that these cDNAs 
were truncated, human fetal liver and fetal brain cDNA libraries (Clontech) were 
screened with ETS05039 and NIB2000-5R, respectively, and complete clones were 
isolated from the human fetal liver cDNA library. The complete sequences revealed that 
the ETS05039- and NIB2000-5R-derived clones are encoded by distinct genes but are 
highly related within the protein coding regions (68% identity at the DNA level, 75% 
identity and 86% similarity at the protein level). The former encodes an N-terminal 
region with no sequence similarity to any proteins in the databases besides the yGCN5- 
related C-terminal region, whereas the latter encodes only the yGCN5-related region. 
Given that p300/CBP-binding activity was observed in the former polypeptide as shown 
below, it was designated p300/CBP-associated factor (P/CAF), having the amino acid 
sequence of SEQ ID NO: 1 and the nucleotide sequence of SEQ ID NO: 10, and the latter 
was named human GCN5 (hGCN5), having the amino acid sequence of SEQ ID NO: 5 
and the nucleotide sequence of SEQ ID NO: 11. 
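
For readers unfamiliar with how a figure such as "75% identity" is obtained, the following Python sketch computes percent identity over two sequences that have already been aligned. It is an illustrative sketch only: the specification does not state which alignment program or gap convention was used, so the choice here to skip gapped columns is an assumption, and the two short peptide strings are hypothetical. Percent similarity would additionally require a substitution matrix defining which residue pairs count as similar and is not shown.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of ungapped alignment columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # assumption: columns containing a gap are not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical pre-aligned peptide fragments, for illustration only.
print(round(percent_identity("MKTA-LL", "MRTAGLL"), 1))
```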

Additionally, an RNA blot (Clontech) was hybridized with a random-primed 
probe made from the cDNA encoding P/CAF. RNA blotting indicated that transcripts 
detected by the P/CAF and hGCN5 cDNAs are ubiquitously expressed, but the former is 
most abundant in heart and skeletal muscle, whereas the latter is most abundant in 
pancreas and skeletal muscle. 

P/CAF-p300/CBP interaction in vitro 

The P/CAF binding site was presumed to reside in the C-terminal one third of
CBP (residues 1,678-2,442) because it was observed that this region, when fused to a
DNA binding domain, activates transcription (4) in a manner repressed by coexpression
of 12S E1A. This region was divided into 6 overlapping fragments and each was
expressed in E. coli as a glutathione-S-transferase (GST) fusion protein. GST-CBP
fusions were incubated with recombinant P/CAF protein and, subsequently, purified
using glutathione-Sepharose. Co-purified P/CAF was detected by immunoblotting
analysis.

To construct GST fusions, various regions of CBP and p300 were amplified by
PCR. A series of deletions of the CBP segment B was created by site-directed in vitro
mutagenesis (30). These fragments were subcloned into pGEX-2T (Pharmacia). GST
fusions were expressed in E. coli and extracted with buffer B [20 mM Tris-HCl (pH
8.0), 5 mM MgCl2, 10% glycerol, 1 mM AEBSF, 0.1% NP40, 10 µg/ml of aprotinin, 10
µg/ml of leupeptin, 1 µg/ml of pepstatin A, 1 mM DTT] containing 0.1 M KCl. For these
experiments, GST-CBP-segment B was purified by glutathione-Sepharose and
phenyl-Sepharose chromatographic steps. P/CAF, hGCN5, and E1A were expressed as
FLAG-fusions in Sf9 cells via baculovirus vectors and affinity-purified with M2-agarose
(ref 30; Kodak-IBI). For interaction, a crude E. coli extract containing 20 pmol of
GST-fusion was incubated with 40-60 pmol of P/CAF or E1A in a total volume of 50 µl of
buffer B with 0.1 M KCl on ice for 10 min. Samples were further incubated with 10 µl
(packed volume) of glutathione-Sepharose at 4°C for 30 min, washed four times with
200 µl of buffer B containing 0.1 M KCl, and eluted with 20 µl of buffer E [50 mM
Tris-HCl (pH 8.0), 0.2 M KCl, 20 mM glutathione] for 60 min. Interacting proteins
were detected by anti-FLAG immunoblotting or silver staining.
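
For orientation, the amounts quoted for the binding reaction can be converted into approximate molar concentrations. The Python sketch below is illustrative only: the helper name `pmol_per_ul_to_um` is hypothetical, and it assumes that the stated 50 µl is the final reaction volume and that the quoted picomole amounts are accurate.

```python
def pmol_per_ul_to_um(amount_pmol: float, volume_ul: float) -> float:
    """Convert an amount in pmol dissolved in a volume in µl to µM.

    1 pmol/µl = 1e-12 mol / 1e-6 l = 1e-6 M = 1 µM, so the ratio is already in µM.
    """
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return amount_pmol / volume_ul


# Values taken from the text; the final-volume interpretation is an assumption.
REACTION_VOLUME_UL = 50.0
gst_fusion_um = pmol_per_ul_to_um(20, REACTION_VOLUME_UL)   # GST fusion
partner_low_um = pmol_per_ul_to_um(40, REACTION_VOLUME_UL)  # P/CAF or E1A, lower bound
partner_high_um = pmol_per_ul_to_um(60, REACTION_VOLUME_UL) # P/CAF or E1A, upper bound

# Molar excess of the FLAG-tagged partner over the GST fusion.
excess_low = partner_low_um / gst_fusion_um
excess_high = partner_high_um / gst_fusion_um
```

Under these assumptions the GST fusion is present at roughly 0.4 µM and the FLAG-tagged partner at a two- to three-fold molar excess.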

For p300 interactions, the segment spanning residues 1763-1966 (segment B') of
p300, which is analogous to the CBP segment-B, was used. Twenty percent of the
P/CAF and hGCN5 inputs and 100% of the E1A input were also analyzed. In the GST
precipitation assays, almost identical amounts of the GST fusions were recovered in all
samples. Interaction between P/CAF and CBP (segment B) was determined in the
absence and in the presence of E1A. Control reactions with GST-CBP alone and
without GST-CBP were also performed. Input proteins were analyzed.

Two CBP segments, A and B, interacted specifically with P/CAF. The stronger
interaction was observed in the latter segment, which does not include the yADA2-like
Zn²⁺ finger. Given that the CBP segment-B is well conserved in p300 (66% identity,
75% similarity), the binding of P/CAF to p300 in vitro was also analyzed. For this
experiment, the p300 segment spanning residues 1763-1966, termed segment B', which
is analogous to the CBP segment-B, was used. Like CBP, p300 interacted specifically
with P/CAF. These studies demonstrated that P/CAF binds specifically to both p300
and CBP in vitro. In contrast to P/CAF, hGCN5 did not bind to CBP or p300.

These studies also demonstrated that the Zn²⁺ finger region of p300/CBP, which
shares sequence similarity with yADA2, is not essential for the interaction with P/CAF.
Cloning of a human structural homolog of yADA2, termed hADA2 (25), has revealed
that, unlike the sequence similarity between p300/CBP and yADA2, which is restricted
to a 50 amino acid region, hADA2 shares extensive similarity (30% identity, 52%
similarity) to yADA2 over the entire protein sequence. Moreover, a computer search of
the complete genomic sequence of Saccharomyces cerevisiae revealed that yeast does
not have counterparts of p300/CBP or P/CAF. Thus, the p300/CBP-P/CAF pathway
may have been acquired during metazoan evolution.

Action of E1A in vitro

Previous reports indicated that E1A binds to both the p300 segment spanning
residues 1767-1816 and the CBP segment spanning residues 1805-1854 (7). These
interactions were reconfirmed in the present system; thus, both p300 and CBP segments
covering the previously identified regions interacted with E1A.

For further mapping, a series of deletions was introduced within the CBP
segment-B and tested for interactions with P/CAF and E1A. Deletions of residues
1801-1825 or 1824-1851 markedly reduced interactions with both P/CAF and E1A,
whereas deletion of residues 1850-1878 did not affect these interactions. Furthermore,
deletion of residues 1801-1851 completely abolished interactions with both P/CAF and
E1A. These data indicate that residues 1801-1851 of CBP are critical for interaction
with both P/CAF and E1A. Taken together with the evidence that CBP segment A (aa
residues 1,678-1,880) also binds to these factors, the above findings demonstrate that
P/CAF and E1A bind to the same or very closely spaced sites on CBP.
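
The reasoning behind the deletion mapping can be sketched in a few lines of Python. This is an illustrative reconstruction, not a procedure described in the specification: the effect labels and the helper `merge_intervals` are hypothetical, and the sketch simply merges the residue ranges whose deletion impaired binding.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]

# Deleted residue ranges of CBP segment B and the reported effect on binding
# of both P/CAF and E1A (labels are paraphrased from the text).
DELETION_EFFECTS: Dict[Interval, str] = {
    (1801, 1825): "reduced",
    (1824, 1851): "reduced",
    (1850, 1878): "unaffected",
    (1801, 1851): "abolished",
}


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or directly adjacent inclusive residue ranges."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Ranges whose removal reduced or abolished binding.
impairing = [iv for iv, effect in DELETION_EFFECTS.items() if effect != "unaffected"]
critical_region = merge_intervals(impairing)
```

With the four deletions listed above, the merged impairing ranges collapse to the single span 1801-1851, which matches the region the text identifies as critical.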

Evidence that both P/CAF and E1A recognize the same p300/CBP segments
raises the possibility of direct competition between P/CAF and E1A for binding to
p300/CBP. To test this possibility, a competition experiment was performed with the
use of affinity purified recombinant proteins. The interaction of P/CAF with the
CBP-segment B was progressively inhibited by the addition of increasing amounts of
E1A. In contrast, no inhibition was caused by an E1A mutant which does not bind to
p300/CBP (E1AΔN). Similar results were obtained with the p300-segment B', leading to
the conclusion that P/CAF and E1A compete for the same binding sites in p300/CBP.

P/CAF-p300/CBP interaction in vivo

The in vivo interaction between P/CAF and p300/CBP was established by
co-immunoprecipitation from a human osteosarcoma cell extract. Proteins in this extract
were immunoprecipitated with rabbit anti-P/CAF, rabbit anti-CBP and anti-p300
antibodies. For controls, cell extract was precipitated with rabbit control IgG or mouse
anti-HA monoclonal antibody. The precipitates were analyzed by immunoblotting with
anti-P/CAF, anti-CBP and anti-p300 antibodies.

Osteosarcoma cells were transfected with either control vector or E1A- or
E1AΔN-expression vectors. Extract from the transfected subpopulation was
immunoprecipitated with anti-P/CAF or control IgG. The precipitates were analyzed by
immunoblotting with anti-p300 and anti-P/CAF antibodies.



Rabbit anti-P/CAF antibody was raised to the P/CAF segment spanning residues
125-397 and purified by immunoaffinity chromatography (33). A mixture of
monoclonal antibodies raised to the human p300 segment spanning residues 1572-2371
(5) and rabbit polyclonal antibodies raised to the mouse CBP segment spanning residues
2-23 (for immunoprecipitation) and 1736-2179 (immunoblotting) were purchased from
Upstate Biotechnology. Approximately 2x10^ human osteosarcoma U-2 OS cells
(ATCC accession number HTB 96) were extracted with 10 ml of lysis buffer [25 mM
HEPES-KOH (pH 7.2), 150 mM potassium acetate, 2 mM EDTA, 1 mM DTT, 1 mM
AEBSF, 10 µg/ml of aprotinin, 10 µg/ml of leupeptin, 1 µg/ml of pepstatin A, 20 mM
sodium fluoride, 0.1% NP40]. Two to 10 ml of extract were incubated with 2 µg of the
respective antibody for four hours at 4°C. Fifty µl (packed volume) of protein-A
Trisacryl (Pierce) were added and incubation was continued for two hours. The matrix
was washed four times with 1 ml of the lysis buffer, then boiled in 2x SDS sample
buffer. Human osteosarcoma U-2 OS cells were transfected with 20 µg of the indicated
plasmid and 1 µg of sorting plasmid (pCMV-IL2R) (31). The transfected subpopulation
was purified by magnetic affinity cell sorting (32). Extract from approximately 2x10^
sorted cells was immunoprecipitated as described.

Anti-P/CAF antibody specifically detected a 95 kDa protein, which is very close
to the calculated value for the full-length P/CAF, in the immunoprecipitates.
Anti-P/CAF antibody co-immunoprecipitated both CBP and p300. Similarly, anti-CBP
antibody also co-immunoprecipitated P/CAF. However, anti-p300 antibody did not
co-immunoprecipitate P/CAF. This is most likely due to steric interference since the
anti-p300 antibody was raised to the p300 segment spanning residues 1572-2371, which
includes the P/CAF binding region. These data demonstrate that P/CAF forms
complexes with both p300 and CBP in vivo.

Action of E1A in vivo

The in vitro experiments described herein indicate that P/CAF and E1A compete
for the binding sites in p300/CBP. Thus, a study was conducted to determine whether
E1A targets the endogenous interaction between P/CAF and p300. An E1A-expression
vector was transiently transfected into human osteosarcoma cells and the transfected
subpopulation was purified by cell sorting. Then, the interaction between P/CAF and
p300 in transfected cells was examined by co-immunoprecipitation with anti-P/CAF
antibody. The endogenous interaction of P/CAF with p300 was drastically inhibited by
expression of E1A. On the other hand, no inhibition was observed by the E1A mutant
lacking the p300 binding domain (E1AΔN), indicating that E1A disrupts the
P/CAF-p300 complex in vivo through an interaction with p300.

Cell cycle regulation by P/CAF 

Given that binding of P/CAF to p300/CBP is inhibited by E1A, experiments
were performed to evaluate whether P/CAF, by binding to and forming a functional
complex with p300, is involved in the regulation of entry into S phase. This possibility
was addressed by examining whether transient expression of P/CAF would affect the
rate of G1/S transit in HeLa cells. P/CAF negatively affected the distribution of cells
between G1 and S phases in this assay.



HeLa cells were transfected by electroporation with 7 µg of P/CAF-expression
plasmid and/or 3 ng of the full-length or the N-terminally deleted (Δ2-36) E1A
12S-expression plasmid as indicated. These plasmids were constructed by subcloning
FLAG-P/CAF and E1A cDNAs into pCX (34) and pcDNAI (Invitrogen), respectively.
All samples, in addition, contained 1 µg of sorting plasmid (pCMV-IL2R) (31) and
carrier plasmid (pCX) to normalize the total amount of DNA to 11 µg. After
transfection, cells were incubated in Dulbecco's modified Eagle's medium with 10% fetal
bovine calf serum for 12 h, and subsequently labeled in medium containing 10 µM
bromo-deoxyuridine (BrdU) for 30 min. Subsequently, the transfected subpopulation
was purified by magnetic affinity cell sorting and nuclei were analyzed by dual parameter
flow cytometry as described (32).

The fraction of cells accumulating in S phase in control cultures was 23%,
compared to 15% in P/CAF-transfected cells. This effect was reproducible in multiple
independent experiments. In parallel experiments to verify the utility of this
experimental protocol, plasmids encoding E2F-1, simian virus 40 small t, cyclin A or
cyclin E increased the accumulation of cells in S phase, whereas plasmids encoding the
cyclin-dependent kinase inhibitors p21 or p27 reduced the number of S phase cells.

On the basis of evidence that E1A and P/CAF compete for binding sites on
p300, it seemed possible that cotransfection of P/CAF with E1A would oppose the
mitogenic effect caused by E1A. As shown by the data herein, this is indeed the case.
E1A alone has mitogenic activity in this experimental setting, while the E1A mutant
lacking the p300 binding domain (E1AΔN) has very weak activity. Comparable
expression levels between wild type and mutant E1A in the transfected cells were
revealed by immunoblotting analysis with anti-E1A. Intriguingly, when P/CAF was
cotransfected with E1A, the mitogenic activity of E1A was significantly counteracted by
P/CAF. These results show that P/CAF and E1A mediate antagonistic effects on cell
cycle progression.

In the course of assessing P/CAF activity, it was also revealed that p300 is able
to inhibit cell cycle progression under the same assay conditions. These findings suggest
that P/CAF and p300, perhaps by forming a complex, act in concert to suppress cell
cycle progression.

Histone acetyltransferase activity in P/CAF 

Acetylation of the N-terminal histone tails has been considered to play a crucial
role in accessibility of transcription factors to nucleosomal templates (26-27). Recently,
yGCN5 has been identified as a histone acetyltransferase (28). On the basis of this
information, intrinsic histone acetyltransferase activity in P/CAF and hGCN5 was
examined. As substrates, the core histones (histones H2A, H2B, H3 and H4) and the
nucleosome core particles (146 base pairs of DNA wrapped around the octamer of core
histones) were used.

Activity of hGCN5 and P/CAF that acetylates free histones or histones in the
nucleosome core particle (35) was measured as described (36). Each reaction contained
0.3 pmol of affinity purified FLAG-hGCN5 or FLAG-P/CAF, 4 pmol of the histone
octamer or the nucleosome core particle and 10 pmol of [1-14C]acetyl-CoA. The
histone octamer dissociated into dimers or tetramers under assay conditions. Acetylated
histones were detected by autoradiography after separation by SDS-PAGE.

P/CAF and hGCN5 acetylated the core histones with almost the same efficiency.
Both factors acetylated histones H3 and H4, but preferentially H3. In contrast, very
weak or no acetylation by hGCN5 was detected in the nucleosome core particles.
Remarkably, significant acetylation by P/CAF was observed in a nucleosomal context.
Although all core histones are acetylated in the nucleus, P/CAF and hGCN5 did not
acetylate histones H2A and H2B in vitro.

The direct function of P/CAF is likely to involve its intrinsic histone acetyltransferase
activity. Although the exact molecular mechanisms by which acetylation of core histones
contributes to transcription remain undefined, acetylation of the histones is considered to
play an important role in transcriptional regulation (26-27). The positively charged
N-terminal tails of core histones are believed to affect nucleosome structure by interacting
with DNA at or near the nucleosome-spacer junction. Acetylation of the histone tails
presumably destabilizes the nucleosome and facilitates access by regulatory factors.
Likewise, there is a general correlation between the level of acetylation and
transcriptional activity of nucleosomal domains. The findings of the present invention
provide insights into the mechanisms of targeted histone acetylation.

Cellular factor p300/CBP binds to various sequence-specific factors that are
involved in cell growth and/or differentiation, including CREB (3,4), c-Jun (9), Fos (11),
c-Myb (12) and nuclear receptors (13). P/CAF could stimulate the activation function
of these factors via promoter-specific histone acetylation. The present invention
demonstrates that E1A appears to perturb normal cellular regulation by disrupting the
connection between p300/CBP and its associated histone acetyltransferase.

II. p300/CBP studies.

Purification of E1A-associated histone acetyltransferase

FLAG-epitope tagged E1A (or ΔE1A) was expressed in Sf9 cells (ATCC
accession number CRL 1711) by infection with recombinant baculovirus (43). All
purification steps were carried out at 4°C. Extract was prepared from infected cells by
one cycle of freeze and thaw in buffer B (20 mM Tris-HCl, pH 8.0; 5 mM MgCl2; 10%
glycerol; 1 mM PMSF; 10 mM β-mercaptoethanol; 0.1% Tween 20) containing 0.1
M KCl and the complete protease inhibitor cocktail (Boehringer Mannheim). To
prepare E1A-immobilized beads, the extract was incubated with M2
anti-FLAG antibody agarose (Kodak-IBI) for four hours with rotating and
subsequently washed with the same buffer three times. The resulting beads were
incubated with HeLa (ATCC accession number CCL 2) nuclear extract for four to eight
hours and thereafter washed with the same buffer six times. Finally, FLAG-E1A was
eluted from the beads along with associated polypeptides by incubating with the same
buffer containing 0.1 mg/ml FLAG peptide.

For further purification, eluted polypeptides were dialyzed in 0.05 M KCl-buffer
B and subsequently loaded onto a SMART Mono Q column (Pharmacia) equilibrated
with the same 0.05 M KCl-buffer B. After washing, the column was developed with a
linear gradient of 0.05-1.0 M KCl in buffer B. Mono Q fractions were concentrated with
a MICROCON spin-filter (Amicon) and subsequently loaded onto a SMART Superdex
200 column (Pharmacia) equilibrated with 0.1 M KCl-buffer B.



Histone acetyltransferase assays

Filter binding assays were performed as described (80) with minor modifications.
Samples were incubated at 30°C for 10-60 minutes in 30 ml of assay buffer containing
50 mM Tris-HCl, pH 8.0; 10% glycerol; 1 mM DTT; 1 mM PMSF; 10 mM sodium
butyrate; 6 pmol of [3H]acetyl CoA (4.3 mCi/mmole, Amersham Life Science Inc.); and
33 mg/ml of calf thymus histones (Sigma Chemical Co.). In experiments where synthetic
peptides were substituted for core histones, 50 pmol of each peptide were used. After
incubation, the reaction mixture was spotted onto Whatman P-81 phosphocellulose filter
paper and washed for 30 minutes with 0.2 M sodium carbonate buffer pH 9.2 at room
temperature with 2-3 changes of the buffer. The dried filters were counted in a liquid
scintillation counter.

PAGE analysis was done as above except that 90 pmol of [14C]acetyl CoA (55
mCi/mmole, Amersham Life Science Inc.) and 9 pmol of core histones or
mononucleosomes were used. Core histones and mononucleosomes were prepared as
described (35). For trypsin digestion, reaction mixtures were further incubated with
various amounts of trypsin on ice for 30 minutes. The samples were analyzed on
one-dimensional SDS-PAGE gels or two-dimensional gels, where the first dimension
was an acid-urea-PAGE gel (44) and the second dimension was an SDS-PAGE gel.

Protein expression 

For baculovirus expression, cDNAs corresponding to p300 portions of aa 1-670,
aa 671-1194 and aa 1135-2414 were amplified by PCR (EXPAND High Fidelity PCR
System; Boehringer Mannheim) as KpnI-NotI fragments. The resulting fragments were
subcloned into a baculovirus transfer vector having the FLAG-tag sequence (43). The
recombinant viruses were isolated using the BACULOGOLD system (Pharmingen),
according to the manufacturer's protocol, and were infected into Sf9 cells (ATCC
accession number CRL 1711) to express FLAG-p300. Recombinant proteins were
affinity purified with M2 anti-FLAG antibody-immobilized agarose (Kodak-IBI)
according to the manufacturer's protocol.

For bacterial expression, cDNAs encoding the p300 portions and the CBP
portion (aa 1174-1850) were first subcloned into the baculovirus transfer vector having
the FLAG-tag as described above. Thereafter, the XhoI and NotI fragments encoding
FLAG-p300 or FLAG-CBP fusions were resubcloned into the E. coli expression vector
pET-28c (Novagene) digested with SalI and NotI. Recombinant proteins were
expressed in E. coli BL21(DE3) and affinity purified with M2-antibody agarose.

Histone acetyltransferases that associate with E1A

Although the adenovirus E1A 12S protein (E1A) inhibits transcription of a
variety of genes via direct binding to p300/CBP (45), E1A also stimulates transcription
in some contexts (46). Thus, p300/CBP-bound E1A was tested to determine whether it
might recruit histone acetyltransferases or deacetylases to regulate transcription. In
addition, experiments were conducted as described below to determine if p300/CBP per
se is a histone acetyltransferase.

Initially, recombinant FLAG-epitope tagged E1A was immobilized on
anti-FLAG antibody beads. Immobilized E1A was incubated with a HeLa nuclear
extract for affinity purification of E1A-associated polypeptides. FLAG-E1A
was then eluted from the beads, along with E1A-associated polypeptides, by
incubating with FLAG-peptide. Although E1A per se has no histone acetyltransferase
activity, E1A recruited significant amounts of histone acetyltransferase activity from the
nuclear extract. It is very unlikely that this activity is derived from P/CAF given that
E1A and P/CAF cannot bind to p300/CBP simultaneously (43). Consistent with this, no
P/CAF was detected in these fractions by immunoblotting.



The E1A N-terminus, a region that is not highly conserved among the various
adenovirus serotypes, is involved in p300/CBP binding in vivo. Mutations in the
N-terminal region lead to loss of the ability for p300/CBP binding without affecting RB
binding (1,47). Thus, the requirement of the E1A N-terminal region for the recruitment
of histone acetyltransferase activity was tested. In contrast to the wild type, the
N-terminally deleted form of E1A (ΔN-E1A) recruited only a background level of
acetyltransferase activity. In agreement with previous reports (47), the ΔN-E1A
showed no ability to interact with p300/CBP, although it still retained the ability to
interact with a variety of other polypeptides, including RB.

To define the relationship between p300/CBP and histone acetyltransferase
activity, affinity purified E1A-binding polypeptides were separated on a Mono Q
ion-exchange column. Both p300/CBP and the acetyltransferase activity were coeluted
at 140 mM KCl, while most of the polypeptides were eluted at 260 mM KCl. The active
fraction from the Mono Q column (~140 mM KCl) was further separated on a
Superdex-200 gel filtration column. Both p300/CBP and the acetyltransferase activity
coeluted after the void volume, indicating that p300/CBP is involved in the histone
acetyltransferase activity.

p300 is a histone acetyltransferase

The data provided herein indicate that p300 per se, or a polypeptide(s)
associated with p300, possesses histone acetyltransferase activity. To test the former
possibility, the acetyltransferase activity of recombinant p300 was measured. p300 was
divided into three fragments, each of which was expressed in and purified from Sf9 cells
via a baculovirus expression vector. Histone acetyltransferase activity was readily
detected in the C-terminal fragment containing amino acids 1135-2414, whereas no
activity was found in the other fragments, demonstrating conclusively that p300 per se is
a histone acetyltransferase.



p300/CBP histone acetyltransferase domain

To map the histone acetyltransferase domain of p300, a series of deletions
was prepared. Given the poor conservation of the glutamine-rich region (aa 1815-2414)
in the C. elegans p300/CBP homolog (6), the p300 fragment encoding aa 1135-1810
was expressed in and purified from E. coli. Importantly, this candidate region of p300
(aa 1135-1810) showed significant histone acetyltransferase activity. For further
mapping within this region, a series of N-terminal deletions was constructed. Deletion
of 60 residues, resulting in a fragment containing aa 1195-1810, had no effect on the
acetyltransferase activity, whereas the deletion of 185 residues, yielding a fragment
comprising aa residues 1320-1810, completely eliminated the acetyltransferase activity.
Next, a series of C-terminal deletions was analyzed to determine the requirement
of the P/CAF (or E1A) -binding domain. The p300 fragments lacking the E1A binding
domain (aa 1195-1760, 1195-1706 and 1195-1673) still retained the acetyltransferase
activity, whereas the further truncated mutant (aa 1195-1652) completely lost the
acetyltransferase activity. Consistent with these results, the mutant carrying an internal
deletion of residues 1418-1720 showed no acetyltransferase activity. These data
demonstrate that the histone acetyltransferase domain is located between the
bromodomain and the E1A-binding domain. Given that the histone acetyltransferase
domain is highly conserved between p300 and CBP (91% similarity), the corresponding
region of CBP, aa residues 1174-1850, was expressed to confirm the acetyltransferase
activity. As expected, comparable activity was detected, indicating that both p300 and
CBP are histone acetyltransferases.
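
The logic of this truncation mapping can be illustrated with a short Python sketch. It is a hypothetical reconstruction rather than a method described in the specification: the fragment lists are transcribed from the paragraph above, and the helper names `common_region` and `contains` are assumptions introduced for illustration.

```python
from typing import List, Tuple

Fragment = Tuple[int, int]  # inclusive amino acid range of a p300 fragment

# p300 fragments reported to retain histone acetyltransferase activity.
ACTIVE: List[Fragment] = [
    (1135, 1810), (1195, 1810), (1195, 1760), (1195, 1706), (1195, 1673),
]
# Truncated fragments reported to have lost activity.
INACTIVE: List[Fragment] = [(1320, 1810), (1195, 1652)]
# Internal deletion reported to abolish activity.
INTERNAL_DELETION: Fragment = (1418, 1720)


def common_region(fragments: List[Fragment]) -> Fragment:
    """Return the residue range shared by every fragment in the list."""
    start = max(s for s, _ in fragments)
    end = min(e for _, e in fragments)
    if start > end:
        raise ValueError("fragments share no common region")
    return start, end


def contains(outer: Fragment, inner: Fragment) -> bool:
    """True if `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


smallest_active = common_region(ACTIVE)

# No inactive fragment should contain the whole shared region.
consistent = not any(contains(f, smallest_active) for f in INACTIVE)

# Intervals within which the N- and C-terminal domain boundaries must fall.
n_boundary = (smallest_active[0], min(s for s, _ in INACTIVE if s > smallest_active[0]))
c_boundary = (max(e for _, e in INACTIVE if e < smallest_active[1]), smallest_active[1])

# The internal deletion should overlap the shared region.
deletion_overlaps = (INTERNAL_DELETION[0] <= smallest_active[1]
                     and INTERNAL_DELETION[1] >= smallest_active[0])
```

On the data as transcribed, the smallest active fragment is aa 1195-1673, the N-terminal boundary of the domain falls between residues 1195 and 1320, and the C-terminal boundary falls between residues 1652 and 1673.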

Among various acetyltransferases including histone acetyltransferases GCN5 and
P/CAF, putative acetyl-CoA binding sites are conserved (48). However, multiple
alignment analysis (49) showed that the p300/CBP histone acetyltransferase domain
does not belong to this group. Moreover, comparison of the p300/CBP histone
acetyltransferase domain with peptide sequence databases (23) showed no sequence
similarity to any other proteins. Accordingly, this invention shows that p300/CBP
represents a novel class of acetyltransferases in that it does not have the conserved motif
found among previously described acetyltransferases (48).

p300 acetylates all core histones in mononucleosomes

Substrate specificity for acetylation by p300 was also examined. As substrates,
histone octamers and mononucleosomes (146 base pairs of DNA wrapped around the
octamer of core histones) were used. Given that the histone octamer dissociates into
dimers or tetramers under physiological conditions, the histone octamer is referred to
here as core histones. When core histones were used, p300 acetylated all four proteins,
but preferentially H3 and H4. More importantly, in a nucleosomal context, p300
acetylated all four core histones nearly stoichiometrically. In contrast, p300 acetylated
neither BSA nor lysozyme.

Hyperacetylated histones are believed to be linked with transcriptionally active
chromatin (26,27,50,51). Hyperacetylated forms are found in histones H4, H3 and H2B,
which have multiple acetylation sites in vivo. Thus, the level of acetylation by p300 was
also tested.

Mononucleosomes treated with p300 were analyzed by two-dimensional gel
electrophoresis. A Coomassie blue-stained gel and the corresponding autoradiogram
showed that a significant amount of histones, especially H4, were hyperacetylated.
Importantly, acetylation levels by p300 were very close to those of hyperacetylated
histones prepared from HeLa nuclei treated with sodium butyrate, a histone deacetylase
inhibitor. In contrast, no acetylated forms were detected in the reaction
without p300. These results indicate that p300 acetylates histones in mononucleosomes
to the hyperacetylated state by targeting multiple lysine residues.



p300 acetylates the four lysines in the histone H4 N-terminal tail in vitro which are 
acetylated in vivo 

Lysines at positions 5, 8, 12 and 16 of histone H4 are acetylated in vivo
(51). Recent studies with yeast histone acetyltransferases demonstrate the
position-specific acetylation by distinct acetyltransferases, i.e., while cytoplasmic
acetyltransferases for histone deposition and chromatin assembly modify
positions 5 and 12, GCN5 modifies positions 8 and 16 (52). Accordingly, the positions
of acetylation by p300 were also determined. A series of synthetic peptides containing
acetylated lysines at various positions was used to determine the acetylation
site-specificity of p300. Consistent with the two-dimensional gel electrophoresis
analysis, the experiments with peptide substrates showed that p300 acetylates all four
lysines in the histone H4 that are acetylated in vivo. These results are consistent with the
view that deposition-related diacetylated histones are deacetylated during maturation
of chromatin (53).

p300 preferentially acetylates the N-terminal histone tail

Histone acetyltransferases modify specific lysine residues in the N-terminal
tail of core histones but not the C-terminal globular domain in vivo (26,27,50,51).
Structural models of nucleosomes (54,55,56) suggest that most of the lysine residues in
the C-terminal globular domain are buried. Therefore, experiments were conducted to
examine whether restricted acetylation of the N-terminal tail resulted from the substrate
specificity of the enzyme or inaccessibility of the enzyme to the core domain in
nucleosomes. The globular domains of all core histones contain a long helix flanked on
either side by a loop segment and short helix, termed the "histone fold" (54,55,56).
The histone fold is involved in formation of the stable H2A-H2B and H3-H4
hetero-dimers, consisting of extensive hydrophobic contacts between the paired
molecules. Therefore, it is likely that a histone monomer cannot fold properly, thereby
increasing access of the histone acetyltransferase to the core domain. Based on these
considerations, experiments were conducted to determine whether p300 acetylates free
histone H4 in an N-terminal-specific manner.

Histone H4 was acetylated with p300 and subsequently the histone tail was
removed by partial digestion with trypsin. The distributions of radioactivity between
intact and core histones were compared. While the globular core histone domain was
predominant at the higher trypsin concentrations, radioactivity was detected mostly in
the intact histone. These data demonstrate that p300 preferentially acetylates the
N-terminal tail of histone H4.



III. P/CAF interaction with MyoD

Tissue culture and transfection experiments 

C2C12 mouse cells (ATCC accession number CRL 1772) were grown in
Dulbecco's modified Eagle medium (DMEM) supplemented with 20% fetal bovine
serum (FBS) until they reached confluence. Differentiation was induced by switching
medium to differentiation medium (DM), consisting of DMEM containing 2% horse
serum. C3H/10T1/2 fibroblasts (ATCC accession number CCL 226) were grown in
DMEM supplemented with 10% FBS. Cells were transfected by the calcium phosphate
precipitation method. Total amounts of transfected DNA were equalized by empty
vector DNA. After 12 h incubation in medium containing the precipitated DNA, the
cells were washed and incubated in fresh DMEM containing 10% FBS for an additional
24 h. Afterwards, differentiation was induced by incubating in DM for 36 to 72 h.
Chloramphenicol acetyltransferase (CAT) assays were performed as previously
described (64,69). The quantities of cell extracts used for CAT assays were normalized
to β-galactosidase activity by cotransfection of 1 mg of the β-galactosidase expression
vector, pON260.

Expression vectors used for transfection experiments are as follows:
pCX-P/CAF for P/CAF (43); pCMV-bp300 for p300 (65), pCMV-p300 (1869-2414)
(64) and pCMV-p300 (1514-1922) (60) for p300 wild type and mutants; pE1A12S,
pE1A12S R2G, pE1A12S D2-36 and pE1A12S 0121-130 for E1A wild type and
mutants (66,67,68); and pEMSV-MyoD for MyoD (64).



The antisense P/CAF RNA expression vector, pcDNA3 P/CAF-AS, was created
as follows. The 2.5 kb EcoRI-KpnI fragment containing the entire P/CAF open reading
frame was isolated from pCX-P/CAF (43). This fragment was subcloned into the
EcoRI-KpnI sites of plasmid pcDNA3 (Invitrogen) so that the antisense P/CAF RNA is
driven by the CMV promoter. Reporter genes employed were 4RE-CAT and
MCK-CAT (69). 4RE-CAT is driven by a synthetic promoter containing 4 copies of the
E-box, whereas MCK-CAT is driven by the native MCK promoter (nucleotides -1256 to
+7).

Microinjection and immunofluorescence 

Cells were grown on small glass slides, subdivided into numbered squares of 2
mm x 2 mm and microinjected with purified and concentrated antibodies, as previously
described (70). For immunofluorescence, cells were fixed in either 2%
paraformaldehyde or 1:2 methanol/acetone solution, preincubated with 5% BSA/PBS
and incubated with the primary antibodies for 30 min at 37°C. Subsequently, antibody
was visualized by incubating with either rhodamine- or fluorescein-conjugated secondary
antibody for 30 min at 37°C. Injected antibodies were stained with a
rhodamine-conjugated secondary antibody and nuclei were counter-stained with DAPI as
previously described (69).

Antibodies employed are as follows: rabbit polyclonal affinity purified
anti-P/CAF antibody (43), rabbit polyclonal anti-p300/CBP antiserum (71), mouse
monoclonal anti-MyoD antibody (clone 5.8A, kindly provided by P. Houghton), goat
polyclonal anti-c-Jun affinity purified antibody (Santa Cruz) and rabbit pre-immune
serum.



Immunoprecipitation and DNA affinity purification

Cells were resuspended in lysis buffer (20 mM NaPO4, 150 mM NaCl, 5 mM
MgCl2, 0.1% NP40, 1 mM DTT, 10 mM sodium fluoride, 0.1 mM sodium vanadate, 1
mM phenylmethylsulfonyl-fluoride and 10 µg/ml each of leupeptin, aprotinin and
pepstatin). After 30 min incubation on ice, samples were centrifuged at 12,000 x g for
30 min and supernatants were used as cell extracts. Extracts were pre-cleared by
incubating with rabbit pre-immune serum and protein A/G Plus-Agarose (Santa Cruz)
for 2 h at 4°C. For immunoprecipitation, the supernatants were incubated with the
respective antibodies for 3 h at 4°C. Protein A/G Plus-Agarose was added, and
incubation continued for 3 h. The matrix was washed with lysis buffer, then boiled in
2 x SDS sample buffer. Immunoblotting was performed by using the ECL
chemiluminescent detection kit (Amersham) according to the manufacturer's protocol.

Affinity purification of E-box-bound complexes was done as previously
described (69). Briefly, 100 ng of the biotinylated double stranded DNA containing the
E-box were immobilized on streptavidin-conjugated magnetic beads and incubated with
500 µg of cell extracts in the presence of poly dI-dC. After extensive washing, bound
proteins were eluted with SDS sample buffer and analyzed by immunoblotting.

In vitro protein-protein interaction assays 

The CBP-B fragment and its deletion derivatives were expressed as
GST-fusions as described previously (43). MyoD and E1A (43) were expressed as
FLAG-fusion proteins in Sf9 cells via a baculovirus expression system and
affinity-purified on M2 anti-FLAG antibody-agarose (Kodak-IBI). Crude E. coli
extracts containing GST-fusions were incubated with various amounts of MyoD and/or
E1A in 50 µl of buffer B (20 mM Tris-HCl, pH 8.0, 0.1 M KCl, 5 mM MgCl2, 10%
glycerol, and 0.1% Nonidet P-40) on ice for 10 min. GST-precipitation was performed
as described (43). MyoD and E1A were detected by immunoblotting with anti-FLAG
M2 antibody. For the interaction between P/CAF and MyoD, 1.5 pmol of
FLAG-P/CAF and 15 pmol of FLAG-MyoD were incubated in 50 µl of buffer B on ice
for 10 min. The mixture was further incubated with 2 µg of anti-P/CAF (43) or
anti-hADA2 antibody for 60 min. The immunocomplexes were precipitated by
incubation with 10 µl of protein A-Trisacryl (Pierce) and rotated for 1-4 hr at 4°C. The
matrix was washed 4 times with 200 µl of buffer B and boiled in 10 µl of 2 x SDS
sample buffer. The proteins were resolved on a 4%-20% gradient SDS-PAGE and
subjected to immunoblotting with the anti-FLAG M2 antibody. The blot was developed
with the SUPERSIGNAL chemiluminescent substrates (Pierce).
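
As a worked illustration of the molar amounts used in the P/CAF-MyoD binding reaction, the following Python sketch converts picomoles to nanograms and computes the molar excess of MyoD over P/CAF. It assumes the apparent molecular weight of about 93,000 daltons stated for P/CAF and ignores the mass of the FLAG tag; the function name is hypothetical.

```python
def pmol_to_ng(amount_pmol: float, mol_weight_da: float) -> float:
    """Convert picomoles to nanograms.

    One picomole of a species of molecular weight M (g/mol) weighs M picograms,
    so the mass in nanograms is amount_pmol * M / 1000.
    """
    return amount_pmol * mol_weight_da / 1000.0


# Assumption: apparent molecular weight of P/CAF by SDS-PAGE (about 93 kDa);
# the FLAG tag is not included.
PCAF_MW_DA = 93_000

pcaf_ng = pmol_to_ng(1.5, PCAF_MW_DA)  # mass of 1.5 pmol of P/CAF, in ng
myod_molar_excess = 15 / 1.5           # MyoD is present at a 10-fold molar excess

if __name__ == "__main__":
    print(pcaf_ng, myod_molar_excess)
```

Under these assumptions, 1.5 pmol of P/CAF corresponds to roughly 140 ng of protein.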






P/CAF coactivates muscle-specific transcription 

P/CAF and MyoD were co-transfected into mouse C3H10T1/2 fibroblasts, and
MyoD-mediated transcription was determined from reporter activity driven by the
artificial (4RE) and the naturally-occurring muscle creatine kinase (MCK) promoters.
Overexpression of P/CAF stimulated MyoD-dependent transcription several-fold from
both promoters. Similar results were obtained for the MyoD-activated myogenin
promoter. Transcriptional activation was further stimulated by co-transfecting
MyoD, P/CAF and p300 expression vectors, suggesting that P/CAF may function by
forming a complex with p300/CBP. Consistent with the lack of DNA binding capacity in
P/CAF, overexpression of P/CAF alone did not increase the basal transcriptional activity
of either enhancer. To test whether P/CAF and p300/CBP function in the same pathway,
two dominant negative forms of p300 were employed which specifically inhibit
p300/CBP-mediated transcription (60,64). The p300 segment spanning residues
1514-1922 inhibits the MyoD-dependent activation via direct interaction with MyoD
(60), whereas the p300 segment spanning residues 1869-2414 inhibits it without direct
interaction (64). Both dominant negative mutants inhibited MyoD-coactivation by
P/CAF, suggesting that P/CAF and p300/CBP function in the same pathway.

For further elucidation of the activation mechanism by P/CAF, the effect of E1A,
which inhibits MyoD-dependent transcription and differentiation (66,72,73) via direct
interaction with p300/CBP (65,78), was tested. Expression of E1A in C3H10T1/2
fibroblasts inhibited stimulation of MyoD-directed transcription by P/CAF
overexpression. E1A mutants lacking p300/CBP-binding activity, E1A Δ2-36 and E1A
R2G (67,79), had almost no effect. On the other hand, an E1A mutant retaining
p300/CBP-binding activity, E1A Δ121-130, behaved like the wild type. Since E1A
associates with p300/CBP, but not with P/CAF, these results suggest that P/CAF
functions in MyoD-directed transcription via interaction with p300/CBP.

To address the role of P/CAF as a myogenic coactivator in a more relevant
environment, P/CAF was overexpressed in proliferating C2C12 myoblasts which express
endogenous myogenic bHLH factors. As observed in fibroblasts, overexpression of
P/CAF stimulated muscle specific transcription. Concomitant expression of exogenous
p300 increased P/CAF-mediated coactivation. The repression exerted by wild type E1A,
but not mutant E1A Δ2-36, on P/CAF coactivation of MyoD was also observed in
muscle cells.

Similar experiments were performed with myogenic cell lines that were stably
transformed with wild type or mutant E1A-expressing vectors (56). Coactivation by
P/CAF was inhibited by wild type E1A or the E1A mutant that retains
p300/CBP-binding activity (E1A Δ121-130). In contrast, E1A mutants that lack
p300/CBP-binding (E1A Δ2-36 and E1A R2G) allowed transcriptional coactivation by
P/CAF. Taken together, these experiments show that P/CAF coactivates MyoD-directed
transcription via interaction with p300/CBP.

P/CAF stimulates myogenic differentiation

Given that P/CAF potentiates MyoD-directed transcription, the ability of P/CAF
to assist MyoD in promoting myogenic differentiation was investigated. To this end,
C3H10T1/2 fibroblasts were transiently transfected with P/CAF and MyoD expression
vectors. An expression vector for the green fluorescent protein (GFP) was
co-transfected to identify transfected cells. After incubation in differentiation medium,
the myogenic conversion of transfected cells was determined by simultaneous expression
of GFP and the differentiation-specific marker myosin heavy chain (MHC). Forced
expression of MyoD in fibroblasts caused muscle differentiation in 12% of the
transfected fibroblasts. This myogenic conversion rose to 20% upon co-expression of
MyoD and P/CAF. As observed in transcription experiments, stimulation of
differentiation by P/CAF was counteracted by co-transfection with the p300 dominant
negative mutant, p300 (1869-2414). Consistent with a general role for coactivators,
overexpression of P/CAF alone was unable to differentiate fibroblasts.
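
The myogenic conversion figures can be expressed as a simple ratio of doubly positive (GFP and MHC) cells to all transfected (GFP-positive) cells. The Python sketch below shows that arithmetic; the function name and the cell counts are hypothetical and were chosen only so that the results match the reported 12% and 20%.

```python
def percent_converted(gfp_and_mhc_positive: int, gfp_positive: int) -> float:
    """Percentage of transfected (GFP-positive) cells that also express MHC."""
    if gfp_positive <= 0:
        raise ValueError("at least one transfected cell is required")
    if not 0 <= gfp_and_mhc_positive <= gfp_positive:
        raise ValueError("doubly positive cells cannot exceed transfected cells")
    return 100.0 * gfp_and_mhc_positive / gfp_positive


# Hypothetical counts reproducing the reported percentages.
myod_only = percent_converted(24, 200)       # MyoD alone
myod_plus_pcaf = percent_converted(40, 200)  # MyoD together with P/CAF

# Relative increase in conversion attributable to P/CAF co-expression.
relative_increase = myod_plus_pcaf / myod_only

if __name__ == "__main__":
    print(myod_only, myod_plus_pcaf, round(relative_increase, 2))
```

With these illustrative counts, co-expression of P/CAF raises conversion by a factor of about 1.7 over MyoD alone.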



Similar experiments were done using proliferating C2C12 myoblasts in which the
differentiation program is already committed. Most of the myoblasts differentiated into
myotubes upon overexpression of P/CAF, whereas only a modest effect was observed upon
overexpression of p300. In contrast, differentiation was inhibited slightly by overexpressing
c-Jun. This inhibitory effect presumably was caused by titration of p300/CBP, which
associates directly with c-Jun (74). A similar inhibition was observed with the p300
dominant negative mutant. Consistent with the transcriptional effect, E1A almost
completely inhibited differentiation. The E1A mutant R2G, lacking p300/CBP-binding
capability but retaining the retinoblastoma protein (Rb)-binding capability, only partially
inhibited differentiation, although this same mutant
inhibited transcription as severely as the wild type. Taken together, these data show that
P/CAF stimulates muscle differentiation by coactivating MyoD function via p300/CBP
association.

P/CAF is essential for myogenic transcription and differentiation 

To test the necessity of P/CAF for myogenic transcription, experiments were
conducted whereby P/CAF synthesis was inhibited by expressing antisense P/CAF RNA.
A vector from which the P/CAF mRNA is transcribed in the antisense orientation
(P/CAF-AS) was transfected with P/CAF and MyoD expression vectors into fibroblasts
and MyoD-dependent transcription was examined. Cotransfection of the antisense
expression vector strongly inhibited MyoD-dependent transcription below the level of
induction elicited by MyoD alone, demonstrating that expression of P/CAF antisense
RNA inhibits not only the coactivation exerted by exogenous P/CAF but also that of
endogenous P/CAF. These results indicate that P/CAF is essential for MyoD-dependent
transcription.

Studies were also carried out to determine whether expression of P/CAF
antisense RNA inhibits myogenic differentiation. C3H10T1/2 fibroblasts were transiently
transfected with various expression vectors with or without the P/CAF antisense RNA
expression vector. Expression of P/CAF antisense RNA reduced MyoD-mediated
myogenic conversion of fibroblasts. Expression of P/CAF antisense RNA also
counteracted the stimulatory effect of both P/CAF and p300 on myogenic
differentiation. These data support the view that P/CAF and p300/CBP coactivate
MyoD-dependent transcription in the same pathway. More drastic inhibition was
observed in C2C12 myoblasts in similar experiments. Therefore, it can be concluded that
P/CAF is essential for transcription of muscle specific genes and hence differentiation
into myotubes.


To further confirm the essential role of P/CAF for myogenic differentiation,
blockage experiments by antibody microinjection were performed. Antibodies were
injected into the cytoplasm of proliferating C2C12 myoblasts to prevent the nuclear
transport of newly synthesized target proteins. After incubating in the differentiation
medium, the degree of differentiation was determined. Microinjection of an anti-P/CAF
antibody almost completely inhibited differentiation. Similar results were obtained by
microinjecting anti-p300/CBP antibodies. Although microinjection of either
anti-p300/CBP or anti-P/CAF antibody was sufficient to inhibit differentiation, an even
greater inhibition was observed by coinjecting both of them. Microinjection of
anti-P/CAF or anti-p300/CBP antibody did not interfere with induction of p53 by DNA
damaging agents, showing specificity of the inhibition by the antibodies. In contrast to
anti-P/CAF or anti-p300/CBP antibodies, the injection of anti-MyoD antibody only
partially inhibited differentiation, supporting the view of functional redundancy between
MyoD and Myf-5 (75,76). Injection of anti-c-Jun antibody or control antibody did not
interfere with muscle differentiation.

Similar experiments were performed with C3H10T1/2 fibroblasts stably
expressing MyoD. In these cells, either anti-p300/CBP or anti-P/CAF antibody
completely inhibited muscle differentiation. In contrast to myoblasts, anti-MyoD
antibody completely blocked differentiation in the fibroblasts expressing MyoD.
Anti-c-Jun and control antibodies did not interfere with differentiation. Taken together,
these results demonstrate that P/CAF and p300/CBP are indispensable for activation of
the myogenic program.



p300/CBP, P/CAF and MyoD form a multimeric complex in vivo

The data described above indicate that P/CAF stimulates MyoD-directed
transcription via association with p300/CBP. Thus, experiments were conducted to
investigate whether P/CAF, p300/CBP and MyoD could associate in a complex.
First, cellular extracts derived from C2C12 myotubes were subjected to
immunoprecipitation. Both anti-MyoD and anti-p300/CBP antibodies co-precipitated
P/CAF. In a complementary experiment, both anti-p300/CBP and anti-P/CAF
antibodies also co-precipitated MyoD, suggesting that these factors form a multimeric
protein complex in myotubes.


Next, attempts were made to detect this complex on the E-box, the DNA
binding site for MyoD. Immobilized DNA containing an E-box sequence was incubated
with myotube extracts. After extensive washing, P/CAF, p300/CBP and MyoD were
analyzed by immunoblotting. P/CAF, p300/CBP and MyoD were all affinity purified on
the immobilized DNA, whereas they were not purified on the control DNA lacking the
E-box. Given that P/CAF and p300/CBP per se cannot bind to DNA, these observations
indicate that P/CAF and p300/CBP are recruited through MyoD at the E-box sites to
form a multi-protein complex.

Complex formation is inhibited by viral transforming factors

Since the oncoviral proteins E1A and large T antigen inhibit myogenic
transcription and differentiation, the effect of these factors on the formation of
complexes on the E-box was tested. Importantly, very small amounts of P/CAF and
p300/CBP were co-purified on the E-box from myocyte extracts which stably express
E1A or large T antigen, although MyoD was detected under these conditions. The lower
recovery of MyoD from E1A-expressing muscle cells could reflect the low level of
MyoD in the extracts (66). These results indicate that E1A and large T antigen
dissociate P/CAF and p300/CBP from MyoD without altering MyoD binding to DNA.

Consistent with the previous observations that transiently expressed E1A
prevents interaction between P/CAF and p300/CBP in vivo (43), the association
between p300/CBP and P/CAF was abolished in myoblasts stably transformed by wild
type E1A but not in those clones transformed with the E1A mutant R2G unable to bind
p300/CBP. Similarly, the interaction between p300/CBP and P/CAF was abolished by
large T antigen but not by the mutant protein that localizes into the cytoplasm (77).


Interaction between MyoD, P/CAF and CBP in vitro 

Previous interaction experiments in vitro indicate that the CBP region spanning
residues 1801 to 1850 is crucial for interaction with both P/CAF and E1A (43). While
most sequence-specific factors bind to CBP sites distinct from the P/CAF/E1A binding
sites, MyoD interacts with an overlapping CBP fragment called the CH3 region
(60,64,65). To understand how P/CAF, p300/CBP and MyoD associate, the CBP sites
important for MyoD binding were mapped more precisely. Consistent with previous
reports (60,64,65), the CBP fragment spanning residues 1801-2000 (fragment B) bound
MyoD. Moreover, deletion of residues 1801 to 1850 within fragment B completely
abolished interaction with MyoD, which is similar to the results obtained with P/CAF
and E1A. Importantly, an internal deletion of residues 1850-1878 abolished the MyoD
interaction with CBP, while it did not affect binding of E1A or P/CAF (43). These
results suggest that MyoD and P/CAF bind to distinct sites of p300/CBP, although the
binding sites may overlap. Moreover, a direct interaction was observed between MyoD
and P/CAF, which may contribute to stabilization of the multimeric complex.

These data show that E1A prevents not only p300/CBP-interaction with
P/CAF but also that with MyoD in vivo. To obtain evidence that this
inhibition is due to the direct action of E1A, competition experiments were performed
in vitro. Importantly, the interaction between CBP and MyoD was strongly inhibited by
addition of E1A, implying that E1A inhibits myogenic transcription by disrupting
multiple interactions.






Although the present process has been described with reference to specific
details of certain embodiments thereof, it is not intended that such details should be
regarded as limitations upon the scope of the invention except as and to the extent that
they are included in the accompanying claims.



Throughout this application various publications are referenced by numbers
within parentheses. Full citations for these publications are as follows. The disclosures
of these publications in their entireties are hereby incorporated by reference into this
application in order to more fully describe the state of the art to which this invention
pertains.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: The United States of America, as represented by the
Secretary, Department of Health and Human Services, c/o
National Institutes of Health, Office of Technology Transfer,
6011 Executive Boulevard, Suite 325, Rockville, Maryland 20842

(ii) TITLE OF THE INVENTION: METHODS AND COMPOSITIONS FOR
p300/CBP-ASSOCIATED TRANSCRIPTIONAL CO-FACTOR P/CAF

(iii) NUMBER OF SEQUENCES: 18

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: NEEDLE & ROSENBERG, P.C.
(B) STREET: Suite 1200, 127 Peachtree Street, NE
(C) CITY: Atlanta
(D) STATE: GA
(E) COUNTRY: USA
(F) ZIP: 30303

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette
(B) COMPUTER: IBM Compatible
(C) OPERATING SYSTEM: DOS
(D) SOFTWARE: FastSEQ for Windows Version 2.0

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE: 23-JUL-1997
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: Corresponding U.S. Serial No.
(B) FILING DATE: 23-JUL-1996

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Miller, Mary L.
(B) REGISTRATION NUMBER: 39,303
(C) REFERENCE/DOCKET NUMBER: 14014.0238/P

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 404/688-0770
(B) TELEFAX: 404/688-9880
(C) TELEX:

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 832 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



Met Ser Glu Ala Gly Gly Ala Gly Pro Gly Gly Cys Gly Ala Gly Ala
1 5 10 15

Gly Ala Gly Ala Gly Pro Gly Ala Leu Pro Pro Gln Pro Ala Ala Leu
20 25 30

Pro Pro Ala Pro Pro Gln Gly Ser Pro Cys Ala Ala Ala Ala Gly Gly
35 40 45

Ser Gly Ala Cys Gly Pro Ala Thr Ala Val Ala Ala Ala Gly Thr Ala
50 55 60

Glu Gly Pro Gly Gly Gly Gly Ser Ala Arg Ile Ala Val Lys Lys Ala
65 70 75 80

Gln Leu Arg Ser Ala Pro Arg Ala Lys Lys Leu Glu Lys Leu Gly Val
85 90 95

Tyr Ser Ala Cys Lys Ala Glu Glu Ser Cys Lys Cys Asn Gly Trp Lys
100 105 110

Asn Pro Asn Pro Ser Pro Thr Pro Pro Arg Ala Asp Leu Gln Gln Ile
115 120 125

Ile Val Ser Leu Thr Glu Ser Cys Arg Ser Cys Ser His Ala Leu Ala
130 135 140

Ala His Val Ser His Leu Glu Asn Val Ser Glu Glu Glu Met Asn Arg
145 150 155 160

Leu Leu Gly Ile Val Leu Asp Val Glu Tyr Leu Phe Thr Cys Val His
165 170 175

Lys Glu Glu Asp Ala Asp Thr Lys Gln Val Tyr Phe Tyr Leu Phe Lys
180 185 190

Leu Leu Arg Lys Ser Ile Leu Gln Arg Gly Lys Pro Val Val Glu Gly
195 200 205

Ser Leu Glu Lys Lys Pro Pro Phe Glu Lys Pro Ser Ile Glu Gln Gly
210 215 220

Val Asn Asn Phe Val Gln Tyr Lys Phe Ser His Leu Pro Ala Lys Glu
225 230 235 240

Arg Gln Thr Ile Val Glu Leu Ala Lys Met Phe Leu Asn Arg Ile Asn
245 250 255

Tyr Trp His Leu Glu Ala Pro Ser Gln Arg Arg Leu Arg Ser Pro Asn
260 265 270

Asp Asp Ile Ser Gly Tyr Lys Glu Asn Tyr Thr Arg Trp Leu Cys Tyr
275 280 285

Cys Asn Val Pro Gln Phe Cys Asp Ser Leu Pro Arg Tyr Glu Thr Thr
290 295 300

Gln Val Phe Gly Arg Thr Leu Leu Arg Ser Val Phe Thr Val Met Arg
305 310 315 320

Arg Gln Leu Leu Glu Gln Ala Arg Gln Glu Lys Asp Lys Leu Pro Leu
325 330 335

Glu Lys Arg Thr Leu Ile Leu Thr His Phe Pro Lys Phe Leu Ser Met
340 345 350

Leu Glu Glu Glu Val Tyr Ser Gln Asn Ser Pro Ile Trp Asp Gln Asp
355 360 365

Phe Leu Ser Ala Ser Ser Arg Thr Ser Gln Leu Gly Ile Gln Thr Val
370 375 380

Ile Asn Pro Pro Pro Val Ala Gly Thr Ile Ser Tyr Asn Ser Thr Ser
385 390 395 400

Ser Ser Leu Glu Gln Pro Asn Ala Gly Ser Ser Ser Pro Ala Cys Lys
405 410 415

Ala Ser Ser Gly Leu Glu Ala Asn Pro Gly Glu Lys Arg Lys Met Thr
420 425 430

Asp Ser His Val Leu Glu Glu Ala Lys Lys Pro Arg Val Met Gly Asp
435 440 445

Ile Pro Met Glu Leu Ile Asn Glu Val Met Ser Thr Ile Thr Asp Pro
450 455 460

Ala Ala Met Leu Gly Pro Glu Thr Asn Phe Leu Ser Ala His Ser Ala
465 470 475 480

Arg Asp Glu Ala Ala Arg Leu Glu Glu Arg Arg Gly Val Ile Glu Phe
485 490 495

His Val Val Gly Asn Ser Leu Asn Gln Lys Pro Asn Lys Lys Ile Leu
500 505 510

Met Trp Leu Val Gly Leu Gln Asn Val Phe Ser His Gln Leu Pro Arg
515 520 525

Met Pro Lys Glu Tyr Ile Thr Arg Leu Val Phe Asp Pro Lys His Lys
530 535 540

Thr Leu Ala Leu Ile Lys Asp Gly Arg Val Ile Gly Gly Ile Cys Phe
545 550 555 560

Arg Met Phe Pro Ser Gln Gly Phe Thr Glu Ile Val Phe Cys Ala Val
565 570 575

Thr Ser Asn Glu Gln Val Lys Gly Tyr Gly Thr His Leu Met Asn His
580 585 590

Leu Lys Glu Tyr His Ile Lys His Asp Ile Leu Asn Phe Leu Thr Tyr
595 600 605

Ala Asp Glu Tyr Ala Ile Gly Tyr Phe Lys Lys Gln Gly Phe Ser Lys
610 615 620

Glu Ile Lys Ile Pro Lys Thr Lys Tyr Val Gly Tyr Ile Lys Asp Tyr
625 630 635 640

Glu Gly Ala Thr Leu Met Gly Cys Glu Leu Asn Pro Arg Ile Pro Tyr
645 650 655

Thr Glu Phe Ser Val Ile Ile Lys Lys Gln Lys Glu Ile Ile Lys Lys
660 665 670

Leu Ile Glu Arg Lys Gln Ala Gln Ile Arg Lys Val Tyr Pro Gly Leu
675 680 685

Ser Cys Phe Lys Asp Gly Val Arg Gln Ile Pro Ile Glu Ser Ile Pro
690 695 700

Gly Ile Arg Glu Thr Gly Trp Lys Pro Ser Gly Lys Glu Lys Ser Lys
705 710 715 720

Glu Pro Arg Asp Pro Asp Gln Leu Tyr Ser Thr Leu Lys Ser Ile Leu
725 730 735

Gln Gln Val Lys Ser His Gln Ser Ala Trp Pro Phe Met Glu Pro Val
740 745 750

Lys Arg Thr Glu Ala Pro Gly Tyr Tyr Glu Val Ile Arg Ser Pro Met
755 760 765

Asp Leu Lys Thr Met Ser Glu Arg Leu Lys Asn Arg Tyr Tyr Val Ser
770 775 780

Lys Lys Leu Phe Met Ala Asp Leu Gln Arg Val Phe Thr Asn Cys Lys
785 790 795 800

Glu Tyr Asn Ala Pro Glu Ser Glu Tyr Tyr Lys Cys Ala Asn Ile Leu
805 810 815

Glu Lys Phe Phe Phe Ser Lys Ile Lys Glu Ala Gly Leu Ile Asp Lys
820 825 830

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 481 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Leu Glu Glu Glu Val Tyr Ser Gln Asn Ser Pro Ile Trp Asp Gln
1 5 10 15

Asp Phe Leu Ser Ala Ser Ser Arg Thr Ser Gln Leu Gly Ile Gln Thr
20 25 30

Val Ile Asn Pro Pro Pro Val Ala Gly Thr Ile Ser Tyr Asn Ser Thr
35 40 45

Ser Ser Ser Leu Glu Gln Pro Asn Ala Gly Ser Ser Ser Pro Ala Cys
50 55 60

Lys Ala Ser Ser Gly Leu Glu Ala Asn Pro Gly Glu Lys Arg Lys Met
65 70 75 80

Thr Asp Ser His Val Leu Glu Glu Ala Lys Lys Pro Arg Val Met Gly
85 90 95

Asp Ile Pro Met Glu Leu Ile Asn Glu Val Met Ser Thr Ile Thr Asp
100 105 110

Pro Ala Ala Met Leu Gly Pro Glu Thr Asn Phe Leu Ser Ala His Ser
115 120 125

Ala Arg Asp Glu Ala Ala Arg Leu Glu Glu Arg Arg Gly Val Ile Glu
130 135 140

Phe His Val Val Gly Asn Ser Leu Asn Gln Lys Pro Asn Lys Lys Ile
145 150 155 160

Leu Met Trp Leu Val Gly Leu Gln Asn Val Phe Ser His Gln Leu Pro
165 170 175

Arg Met Pro Lys Glu Tyr Ile Thr Arg Leu Val Phe Asp Pro Lys His
180 185 190

Lys Thr Leu Ala Leu Ile Lys Asp Gly Arg Val Ile Gly Gly Ile Cys
195 200 205

Phe Arg Met Phe Pro Ser Gln Gly Phe Thr Glu Ile Val Phe Cys Ala
210 215 220

Val Thr Ser Asn Glu Gln Val Lys Gly Tyr Gly Thr His Leu Met Asn
225 230 235 240

His Leu Lys Glu Tyr His Ile Lys His Asp Ile Leu Asn Phe Leu Thr
245 250 255

Tyr Ala Asp Glu Tyr Ala Ile Gly Tyr Phe Lys Lys Gln Gly Phe Ser
260 265 270

Lys Glu Ile Lys Ile Pro Lys Thr Lys Tyr Val Gly Tyr Ile Lys Asp
275 280 285

Tyr Glu Gly Ala Thr Leu Met Gly Cys Glu Leu Asn Pro Arg Ile Pro
290 295 300

Tyr Thr Glu Phe Ser Val Ile Ile Lys Lys Gln Lys Glu Ile Ile Lys
305 310 315 320

Lys Leu Ile Glu Arg Lys Gln Ala Gln Ile Arg Lys Val Tyr Pro Gly
325 330 335

Leu Ser Cys Phe Lys Asp Gly Val Arg Gln Ile Pro Ile Glu Ser Ile
340 345 350

Pro Gly Ile Arg Glu Thr Gly Trp Lys Pro Ser Gly Lys Glu Lys Ser
355 360 365

Lys Glu Pro Arg Asp Pro Asp Gln Leu Tyr Ser Thr Leu Lys Ser Ile
370 375 380

Leu Gln Gln Val Lys Ser His Gln Ser Ala Trp Pro Phe Met Glu Pro
385 390 395 400

Val Lys Arg Thr Glu Ala Pro Gly Tyr Tyr Glu Val Ile Arg Ser Pro
405 410 415

Met Asp Leu Lys Thr Met Ser Glu Arg Leu Lys Asn Arg Tyr Tyr Val
420 425 430

Ser Lys Lys Leu Phe Met Ala Asp Leu Gln Arg Val Phe Thr Asn Cys
435 440 445

Lys Glu Tyr Asn Ala Pro Glu Ser Glu Tyr Tyr Lys Cys Ala Asn Ile
450 455 460

Leu Glu Lys Phe Phe Phe Ser Lys Ile Lys Glu Ala Gly Leu Ile Asp
465 470 475 480

Lys



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 203 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Arg Val Val Gln His Thr Lys Gly Cys Lys Arg Lys Thr Asn Gly Gly
1 5 10 15

Cys Pro Ile Cys Lys Gln Leu Ile Ala Leu Cys Cys Tyr His Ala Lys
20 25 30

His Cys Gln Glu Asn Lys Cys Pro Val Pro Phe Cys Leu Asn Ile Lys
35 40 45

Gln Lys Leu Arg Gln Gln Gln Leu Gln His Arg Leu Gln Gln Ala Gln
50 55 60

Met Leu Arg Arg Arg Met Ala Ser Met Arg Thr Gly Val Val Gly Gln
65 70 75 80

Gln Gln Gly Leu Pro Ser Pro Thr Pro Ala Thr Pro Thr Thr Pro Thr
85 90 95

Gly Gln Gln Pro Thr Thr Pro Gln Thr Pro Gln Pro Thr Ser Gln Pro
100 105 110

Gln Pro Thr Pro Pro Asn Ser Met Pro Pro Tyr Leu Pro Arg Thr Gln
115 120 125

Ala Ala Gly Pro Val Ser Gln Gly Lys Ala Ala Gly Gln Val Thr Pro
130 135 140

Pro Thr Pro Pro Gln Thr Ala Gln Pro Pro Leu Pro Gly Pro Pro Pro
145 150 155 160

Thr Ala Val Glu Met Ala Met Gln Ile Gln Arg Ala Ala Glu Thr Gln
165 170 175

Arg Gln Met Ala His Val Gln Ile Phe Gln Arg Pro Ile Gln His Gln
180 185 190

Met Pro Pro Met Thr Pro Met Ala Pro Met Gly
195 200

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 351 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Ser Glu Ala Gly Gly Ala Gly Pro Gly Gly Cys Gly Ala Gly Ala
1 5 10 15

Gly Ala Gly Ala Gly Pro Gly Ala Leu Pro Pro Gln Pro Ala Ala Leu
20 25 30

Pro Pro Ala Pro Pro Gln Gly Ser Pro Cys Ala Ala Ala Ala Gly Gly
35 40 45

Ser Gly Ala Cys Gly Pro Ala Thr Ala Val Ala Ala Ala Gly Thr Ala
50 55 60

Glu Gly Pro Gly Gly Gly Gly Ser Ala Arg Ile Ala Val Lys Lys Ala
65 70 75 80

Gln Leu Arg Ser Ala Pro Arg Ala Lys Lys Leu Glu Lys Leu Gly Val
85 90 95

Tyr Ser Ala Cys Lys Ala Glu Glu Ser Cys Lys Cys Asn Gly Trp Lys
100 105 110

Asn Pro Asn Pro Ser Pro Thr Pro Pro Arg Ala Asp Leu Gln Gln Ile
115 120 125

Ile Val Ser Leu Thr Glu Ser Cys Arg Ser Cys Ser His Ala Leu Ala
130 135 140

Ala His Val Ser His Leu Glu Asn Val Ser Glu Glu Glu Met Asn Arg
145 150 155 160

Leu Leu Gly Ile Val Leu Asp Val Glu Tyr Leu Phe Thr Cys Val His
165 170 175

Lys Glu Glu Asp Ala Asp Thr Lys Gln Val Tyr Phe Tyr Leu Phe Lys
180 185 190

Leu Leu Arg Lys Ser Ile Leu Gln Arg Gly Lys Pro Val Val Glu Gly
195 200 205

Ser Leu Glu Lys Lys Pro Pro Phe Glu Lys Pro Ser Ile Glu Gln Gly
210 215 220

Val Asn Asn Phe Val Gln Tyr Lys Phe Ser His Leu Pro Ala Lys Glu
225 230 235 240

Arg Gln Thr Ile Val Glu Leu Ala Lys Met Phe Leu Asn Arg Ile Asn
245 250 255

Tyr Trp His Leu Glu Ala Pro Ser Gln Arg Arg Leu Arg Ser Pro Asn
260 265 270

Asp Asp Ile Ser Gly Tyr Lys Glu Asn Tyr Thr Arg Trp Leu Cys Tyr
275 280 285

Cys Asn Val Pro Gln Phe Cys Asp Ser Leu Pro Arg Tyr Glu Thr Thr
290 295 300

Gln Val Phe Gly Arg Thr Leu Leu Arg Ser Val Phe Thr Val Met Arg
305 310 315 320

Arg Gln Leu Leu Glu Gln Ala Arg Gln Glu Lys Asp Lys Leu Pro Leu
325 330 335

Glu Lys Arg Thr Leu Ile Leu Thr His Phe Pro Lys Phe Leu Ser
340 345 350

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 476 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:



Met Leu Glu Glu Glu Ile Tyr Gly Ala Asn Ser Pro Ile Trp Glu Ser
1 5 10 15

Gly Phe Thr Met Pro Pro Ser Glu Gly Thr Gln Leu Val Pro Arg Pro
20 25 30

Ala Ser Val Ser Ala Ala Val Val Pro Ser Thr Pro Ile Phe Ser Pro
35 40 45

Ser Met Gly Gly Gly Ser Asn Ser Ser Leu Ser Leu Asp Ser Ala Gly
50 55 60

Ala Glu Pro Met Pro Gly Glu Lys Arg Thr Leu Pro Glu Asn Leu Thr
65 70 75 80

Leu Glu Asp Ala Lys Arg Leu Arg Val Met Gly Asp Ile Pro Met Glu
85 90 95

Leu Val Asn Glu Val Met Leu Thr Ile Thr Asp Pro Ala Ala Met Leu
100 105 110

Gly Pro Glu Thr Ser Leu Leu Ser Ala Asn Ala Ala Arg Asp Glu Thr
115 120 125

Ala Arg Leu Glu Glu Arg Arg Gly Ile Ile Glu Phe His Val Ile Gly
130 135 140

Asn Ser Leu Thr Pro Lys Ala Asn Arg Arg Val Leu Leu Trp Leu Val
145 150 155 160

Gly Leu Gln Asn Val Phe Ser His Gln Leu Pro Arg Met Pro Lys Glu
165 170 175

Tyr Ile Ala Arg Leu Val Phe Asp Pro Lys His Lys Thr Leu Ala Leu
180 185 190

Ile Lys Asp Gly Arg Val Ile Gly Gly Ile Cys Phe Arg Met Phe Pro
195 200 205

Thr Gln Gly Phe Thr Glu Ile Val Phe Cys Ala Val Thr Ser Asn Glu
210 215 220

Gln Val Lys Gly Tyr Gly Thr His Leu Met Asn His Leu Lys Glu Tyr
225 230 235 240

His Ile Lys His Asn Ile Leu Tyr Phe Leu Thr Tyr Ala Asp Glu Tyr
245 250 255

Ala Ile Gly Tyr Phe Lys Lys Gln Gly Phe Ser Lys Asp Ile Lys Val
260 265 270

Pro Lys Ser Arg Tyr Leu Gly Tyr Ile Lys Asp Tyr Glu Gly Ala Thr
275 280 285

Leu Met Glu Cys Glu Leu Asn Pro Arg Ile Pro Tyr Thr Glu Leu Ser
290 295 300

His Ile Ile Lys Lys Gln Lys Glu Ile Ile Lys Lys Leu Ile Glu Arg
305 310 315 320

Lys Gln Ala Gln Ile Arg Lys Val Tyr Pro Gly Leu Ser Cys Phe Lys
325 330 335

Glu Gly Val Arg Gln Ile Pro Val Glu Ser Val Pro Gly Ile Arg Glu
340 345 350

Thr Gly Trp Lys Pro Leu Gly Lys Glu Lys Gly Lys Glu Leu Lys Asp
355 360 365

Pro Asp Gln Leu Tyr Thr Thr Leu Lys Asn Leu Leu Ala Gln Ile Lys
370 375 380

Ser His Pro Ser Ala Trp Pro Phe Met Glu Pro Val Lys Lys Ser Glu
385 390 395 400

Ala Pro Asp Tyr Tyr Glu Val Ile Arg Phe Pro Ile Asp Leu Lys Thr
405 410 415

Met Thr Glu Arg Leu Arg Ser Arg Tyr Tyr Val Thr Arg Lys Leu Phe
420 425 430

Val Ala Asp Leu Gln Arg Val Ile Ala Asn Cys Arg Glu Tyr Asn Pro
435 440 445

Pro Asp Ser Glu Tyr Cys Arg Cys Ala Ser Ala Leu Glu Lys Phe Phe
450 455 460

Tyr Phe Lys Leu Lys Glu Gly Gly Leu Ile Asp Lys
465 470 475

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2414 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:



Met Ala Glu Asn Val Val Glu Pro Gly Pro Pro Ser Ala Lys Arg Pro
1 5 10 15

Lys Leu Ser Ser Pro Ala Leu Ser Ala Ser Ala Ser Asp Gly Thr Asp
20 25 30

Phe Gly Ser Leu Phe Asp Leu Glu His Asp Leu Pro Asp Glu Leu Ile
35 40 45

Asn Ser Thr Glu Leu Gly Leu Thr Asn Gly Gly Asp Ile Asn Gln Leu
50 55 60

Gln Gln Thr Ser Leu Gly Met Val Gln Asp Ala Ala Ser Lys His Lys
65 70 75 80

Leu Ser Glu Leu Leu Arg Ser Gly Ser Ser Pro Asn Leu Asn Met Gly
85 90 95

Val Gly Gly Pro Gly Gln Val Met Ala Ser Gln Ala Gln Gln Ser Ser
100 105 110
Pro


Gly 


Leu 


(j1 y 


Leu 


T 1 <a 

J. X e 


Asn 


oe r 


rie u 






115 










l^u 




Ala 


Gly . 


Leu 


i n r 


Ser 


Pro 


As n 


iTie u 


\9± y 




130 










1 J D 






Gin 


Gly 


Pro 


Thr 


CaJLn 


oe r 


i nr 


Cjri y 


riec 


145 
















Pro 


Ala 


Met 


Gly 


Met 


Asn 


Thr 


Gly 


Thr 










lit 
loo 










Met 


Leu 


Ala 


Ala 


Gly 


Asn 


Gl y 


Gin 


Gly 








180 










185 


Asn 


Gly 


Ser 


lie 


Gly 


Ala 


Gly 


Arg 


Gly 






195 










200 




Asn 


Pro. 


Gly 


Met 


Gly 


Ser 


Ala 


Gly 


Asn 




210 










215 






Gin 


Gly 


Ser 


Pro 


Gin 


Met 


Gly 


Gly 


Gin 


225 










230 








Pro 


Leu 


Lys 


Met 


Gly 


Met 


Met 


Asn 


Asn 










24 5' 










Tyr 


Thr 


Gin 


Asn 


Pro 


Gl y 


Gin 


Gin 


lie 






2 60 










O c 


Gin 


I le 


Gin 


Thr 


Lys 


Th r 


Val 


Leu 


Ser 






275 










280 




Met 


Asp 


Lys 


Lys 


Ala 


Val 


Pro 


Gly 


Gl y 




290 










295 






Gin 


Pro 


Ala 


Pro 


Gin 


Val 


Gin 


Gin 


Pro 


305 










310 








Gin 


Gly 


Met 


Gly 


Ser 


Gly 


Ala 


His 


Thr 










325 










Leu 


lie 


Gin 


Gin 


Gin 


Leu 


Val 


Leu 


Leu 








34 0 










3 45 


Arg 


Arg 


Glu 


Gin 


Ala 


Asn 


Gl y 


Glu 


Val 






355 










360 




Cys 


Arg 


Thr 


Met 


Lys 


Asn 


Val 


Leu 


Asn 




370 










37 5 






Gly 


Lys 


Ser 


Cys 


Gin 


Val 


Ala 


His 


Cys 


385 










390 








Ser 


His 


Trp 


Lys 


Asn 


Cys 


Thr 


Arg 


His 










405 










Leu 


Lys 


Asn 


Ala 


Gly 


Asp 


Lys 


Arg 


Asn 








420 










425 


Ala 


Pro 


Val 


Gly 


Leu 


Gly 


Asn 


Pro 


Ser 






435 










440 




Ser 


Ala 


Pro 


Asn 


Leu 


Ser 


Thr 


Val 


Ser 




450 










455 






Glu 


Arg 


Ala 


Tyr 


Ala 


Ala 


Leu 


Gly 


Leu 


465 










47 0 








Pro 


Thr 


Gin 


Pro 


Gin 


Val 


Gin 


Ala 


Lys 










485 










Gly 


Gin 


Ser 


Pro 


Gin 


Gly 


Met 


Arg 


Pro 






500 










505 


Pro 


Mer 


Gly 


Val 


Asn 


Gly 


Gly 


Val 


Gly 






515 










520 




Ser 


Asp 


Ser 


Met 


Leu 


His 


•Ser 


Ala 


lie 




530 










535 






Ser 


Giu 


Asn 


Ala 


Ser 


Val 


Pro 


Ser 


Leu 


545 










550 








Gin 


Pro 


Ser 


Thr 


Thr 


Gly 


lie 


Arg 


Lys 










565 










Gin 


Asp 


Leu 


Arg 


Asn 


His 


Leu 


Val 


His 








580 










585 


Pro 


Thr 


Pro 


Asp 


Pro 


Ala 


Ala 


Leu 


Lys 






595 










600 














i^ys 


OCX. 


IT ^ U 


Me t 


Thr 


Gin 






^. ^ -J 










Gl y 


Thr 


Ser 


Gl V 


Pro 


As n 






± H \J 










Mo t- 

rne 1- 


As n 




^ r o 


Va 1 
V dx 


As n 


Gin 














X O VJ 


Asn 


M-L a 


\j± y 


Mia t- 


Asn 


IT X O 


GX y 


i / u 










I/O 




lie 


ne L. 


Pro 


As n 


\jX n 


V ai 


ne u 










1 Q A 

i y U 






Arg 


Ciin 


Asp 


.rie u 


Gl. n 


Tyr 


Pro 








O A C 








Leu 


Leu 


Thr 


Glu 


Pro 


Leu 


Gin 






220 










Tnr 


Gl y 


Leu 


Arg 


G-L y 


r ro 


Gl n 




o "5 e 

23 5 










O /I A 

z 4 U 


Pro 


Asn 


Pro 


Tyr 


Gly 


Ser 


Pro 


250 














y 


M.X a 


o e r 




Leu 


vj X y 


Leu 










9 "7 n 






Asn 


Asn 


Leu 


Ser 


Pro 


rr lie 


ma 

>-VX cL 








"7 p s 
z o ^ 








Gl y 


Me u 


Pro 


As n 


ne L. 


Gx y 


V7X n 
















Gly 


Leu 


vai 


i nr 


Pro 


V ai 


Aia 




315 










"3 O A 
OZ U 


Ala 


Asp 


Pro 


Glu 


Lys 


Arg 


Lys 


3 30 










T Q Ci 
O J 0 




Leu 


n X S 


/^X a. 


nx 5 


Lys 


y o 


k^X i 1 










T =4 A 
J O u 






Arg 


Gl n 


Cys 


As n 


Leu 


Pro 


rll s 








J o o 








Hi s 


Met 


i nr 


nx s 


C ys 


ltx n 


Cor- 
ner 






O Q A 

o a u 










TV 1 

Ala 


Ser 


Ser 


Arg 


Gin 


lie 


lie 




395 










>1 A A 


Asp 


Cys 


Pro 


Val 


Cys 


Leu 


Pro 


410 










415 




Gin 


Gin 


Pro 


lie 


Leu 


Thr 


Gly 










4 30 






Ser 


Leu 


Gly 


Val 


Gly 


Gin 


Gin 








4 4 5 








Gin 


lie 


Asp 


Pro 


Ser 


Ser 


lie 






4 60 










Pro 


Tyr 


Gin 


Val 


Asn 


Gin 


Met 




475 










yi Q A 
4 O U 


Asn 


Gin 


Gin 


Asn 


Gin 


Gin 


Pro 


490 










/IOC 




Met 


Ser 


Asn 


Met 


Ser 


>\_i a 


e a T- 

o e r 










^ 1 A 

3 1 U 






Val 


Gin 


Thr 


Pro 


b e r 


Leu 


Leu 








D ^ D 








Ron 




Gin 


As n 


P r o 


Met 


Me t 






540 










Gly 


Pro 


Met 


Pro 


Thr 


Ala 


Ala 




555 










560 


Gin 


Trp 


His 


Glu 


Asp 


lie 


Thr 


570 










575 




Lys 


Leu 


Val 


Gin 


Ala 


lie 


Phe 










590 






Asp 


Arg 


Arg 


Met 


Glu 


Asn 


Leu 



605 
Val Ala Tyr Ala Arg Lys Val Glu Gly Asp Met Tyr Glu Ser Ala Asn
610 615 620

Asn Arg Ala Glu Tyr Tyr His Leu Leu Ala Glu Lys Ile Tyr Lys Ile
625 630 635 640

Gln Lys Glu Leu Glu Glu Lys Arg Arg Thr Arg Leu Gln Lys Gln Asn
645 650 655

Met Leu Pro Asn Ala Ala Gly Met Val Pro Val Ser Met Asn Pro Gly
660 665 670

Pro Asn Met Gly Gln Pro Gln Pro Gly Met Thr Ser Asn Gly Pro Leu
675 680 685

Pro Asp Pro Ser Met Ile Arg Gly Ser Val Pro Asn Gln Met Met Pro
690 695 700

Arg Ile Thr Pro Gln Ser Gly Leu Asn Gln Phe Gly Gln Met Ser Met
705 710 715 720

Ala Gln Pro Pro Ile Val Pro Arg Gln Thr Pro Pro Leu Gln His His
725 730 735

Gly Gln Leu Ala Gln Pro Gly Ala Leu Asn Pro Pro Met Gly Tyr Gly
740 745 750

Pro Arg Met Gln Gln Pro Ser Asn Gln Gly Gln Phe Leu Pro Gln Thr
755 760 765

Gln Phe Pro Ser Gln Gly Met Asn Val Thr Asn Ile Pro Leu Ala Pro
770 775 780

Ser Ser Gly Gln Ala Pro Val Ser Gln Ala Gln Met Ser Ser Ser Ser
785 790 795 800

Cys Pro Val Asn Ser Pro Ile Met Pro Pro Gly Ser Gln Gly Ser His
805 810 815

Ile His Cys Pro Gln Leu Pro Gln Pro Ala Leu His Gln Asn Ser Pro
820 825 830

Ser Pro Val Pro Ser Arg Thr Pro Thr Pro His His Thr Pro Pro Ser
835 840 845

Ile Gly Ala Gln Gln Pro Pro Ala Thr Thr Ile Pro Ala Pro Val Pro
850 855 860

Thr Pro Pro Ala Met Pro Pro Gly Pro Gln Ser Gln Ala Leu His Pro
865 870 875 880

Pro Pro Arg Gln Thr Pro Thr Pro Pro Thr Thr Gln Leu Pro Gln Gln
885 890 895

Val Gln Pro Ser Leu Pro Ala Ala Pro Ser Ala Asp Gln Pro Gln Gln
900 905 910

Gln Pro Arg Ser Gln Gln Ser Thr Ala Ala Ser Val Pro Thr Pro Asn
915 920 925

Ala Pro Leu Leu Pro Pro Gln Pro Ala Thr Pro Leu Ser Gln Pro Ala
930 935 940

Val Ser Ile Glu Gly Gln Val Ser Asn Pro Pro Ser Thr Ser Ser Thr
945 950 955 960

Glu Val Asn Ser Gln Ala Ile Ala Glu Lys Gln Pro Ser Gln Glu Val
965 970 975

Lys Met Glu Ala Lys Met Glu Val Asp Gln Pro Glu Pro Ala Asp Thr
980 985 990

Gln Pro Glu Asp Ile Ser Glu Ser Lys Val Glu Asp Cys Lys Met Glu
995 1000 1005

Ser Thr Glu Thr Glu Glu Arg Ser Thr Glu Leu Lys Thr Glu Ile Lys
1010 1015 1020

Glu Glu Glu Asp Gln Pro Ser Thr Ser Ala Thr Gln Ser Ser Pro Ala
1025 1030 1035 1040

Pro Gly Gln Ser Lys Lys Lys Ile Phe Lys Pro Glu Glu Leu Arg Gln
1045 1050 1055

Ala Leu Met Pro Thr Leu Glu Ala Leu Tyr Arg Gln Asp Pro Glu Ser
1060 1065 1070

Leu Pro Phe Arg Gln Pro Val Asp Pro Gln Leu Leu Gly Ile Pro Asp
1075 1080 1085

Tyr Phe Asp Ile Val Lys Ser Pro Met Asp Leu Ser Thr Ile Lys Arg
1090 1095 1100

Lys Leu Asp Thr Gly Gln Tyr Gln Glu Pro Trp Gln Tyr Val Asp Asp
1105 1110 1115 1120

Ile Trp Leu Met Phe Asn Asn Ala Trp Leu Tyr Asn Arg Lys Thr Ser
1125 1130 1135

Arg Val Tyr Lys Tyr Cys Ser Lys Leu Ser Glu Val Phe Glu Gln Glu
1140 1145 1150

Ile Asp Pro Val Met Gln Ser Leu Gly Tyr Cys Cys Gly Arg Lys Leu
1155 1160 1165

Glu Phe Ser Pro Gln Thr Leu Cys Cys Tyr Gly Lys Gln Leu Cys Thr
1170 1175 1180

Ile Pro Arg Asp Ala Thr Tyr Tyr Ser Tyr Gln Asn Arg Tyr His Phe
1185 1190 1195 1200

Cys Glu Lys Cys Phe Asn Glu Ile Gln Gly Glu Ser Val Ser Leu Gly
1205 1210 1215

Asp Asp Pro Ser Gln Pro Gln Thr Thr Ile Asn Lys Glu Gln Phe Ser
1220 1225 1230

Lys Arg Lys Asn Asp Thr Leu Asp Pro Glu Leu Phe Val Glu Cys Thr
1235 1240 1245

Glu Cys Gly Arg Lys Met His Gln Ile Cys Val Leu His His Glu Ile
1250 1255 1260

Ile Trp Pro Ala Gly Phe Val Cys Asp Gly Cys Leu Lys Lys Ser Ala
1265 1270 1275 1280

Arg Thr Arg Lys Glu Asn Lys Phe Ser Ala Lys Arg Leu Pro Ser Thr
1285 1290 1295

Arg Leu Gly Thr Phe Leu Glu Asn Arg Val Asn Asp Phe Leu Arg Arg
1300 1305 1310

Gln Asn His Pro Glu Ser Gly Glu Val Thr Val Arg Val Val His Ala
1315 1320 1325

Ser Asp Lys Thr Val Glu Val Lys Pro Gly Met Lys Ala Arg Phe Val
1330 1335 1340

Asp Ser Gly Glu Met Ala Glu Ser Phe Pro Tyr Arg Thr Lys Ala Leu
1345 1350 1355 1360

Phe Ala Phe Glu Glu Ile Asp Gly Val Asp Leu Cys Phe Phe Gly Met
1365 1370 1375

His Val Gln Glu Tyr Gly Ser Asp Cys Pro Pro Pro Asn Gln Arg Arg
1380 1385 1390

Val Tyr Ile Ser Tyr Leu Asp Ser Val His Phe Phe Arg Pro Lys Cys
1395 1400 1405

Leu Arg Thr Ala Val Tyr His Glu Ile Leu Ile Gly Tyr Leu Glu Tyr
1410 1415 1420

Val Lys Lys Leu Gly Tyr Thr Thr Gly His Ile Trp Ala Cys Pro Pro
1425 1430 1435 1440

Ser Glu Gly Asp Asp Tyr Ile Phe His Cys His Pro Pro Asp Gln Lys
1445 1450 1455

Ile Pro Lys Pro Lys Arg Leu Gln Glu Trp Tyr Lys Lys Met Leu Asp
1460 1465 1470

Lys Ala Val Ser Glu Arg Ile Val His Asp Tyr Lys Asp Ile Phe Lys
1475 1480 1485

Gln Ala Thr Glu Asp Arg Leu Thr Ser Ala Lys Glu Leu Pro Tyr Phe
1490 1495 1500

Glu Gly Asp Phe Trp Pro Asn Val Leu Glu Glu Ser Ile Lys Glu Leu
1505 1510 1515 1520

Glu Gln Glu Glu Glu Glu Arg Lys Arg Glu Glu Asn Thr Ser Asn Glu
1525 1530 1535

Ser Thr Asp Val Thr Lys Gly Asp Ser Lys Asn Ala Lys Lys Lys Asn
1540 1545 1550

Asn Lys Lys Thr Ser Lys Asn Lys Ser Ser Leu Ser Arg Gly Asn Lys
1555 1560 1565

Lys Lys Pro Gly Met Pro Asn Val Ser Asn Asp Leu Ser Gln Lys Leu
1570 1575 1580

Tyr Ala Thr Met Glu Lys His Lys Glu Val Phe Phe Val Ile Arg Leu
1585 1590 1595 1600

Ile Ala Gly Pro Ala Ala Asn Ser Leu Pro Pro Ile Val Asp Pro Asp
1605 1610 1615

Pro Leu Ile Pro Cys Asp Leu Met Asp Gly Arg Asp Ala Phe Leu Thr
1620 1625 1630

Leu Ala Arg Asp Lys His Leu Glu Phe Ser Ser Leu Arg Arg Ala Gln
1635 1640 1645

Trp Ser Thr Met Cys Met Leu Val Glu Leu His Thr Gln Ser Gln Asp
1650 1655 1660

Arg Phe Val Tyr Thr Cys Asn Glu Cys Lys His His Val Glu Thr Arg
1665 1670 1675 1680

Trp His Cys Thr Val Cys Glu Asp Tyr Asp Leu Cys Ile Thr Cys Tyr
1685 1690 1695

Asn Thr Lys Asn His Asp His Lys Met Glu Lys Leu Gly Leu Gly Leu
1700 1705 1710

Asp Asp Glu Ser Asn Asn Gln Gln Ala Ala Ala Thr Gln Ser Pro Gly
1715 1720 1725

Asp Ser Arg Arg Leu Ser Ile Gln Arg Cys Ile Gln Ser Leu Val His
1730 1735 1740

Ala Cys Gln Cys Arg Asn Ala Asn Cys Ser Leu Pro Ser Cys Gln Lys
1745 1750 1755 1760

Met Lys Arg Val Val Gln His Thr Lys Gly Cys Lys Arg Lys Thr Asn
1765 1770 1775

Gly Gly Cys Pro Ile Cys Lys Gln Leu Ile Ala Leu Cys Cys Tyr His
1780 1785 1790

Ala Lys His Cys Gln Glu Asn Lys Cys Pro Val Pro Phe Cys Leu Asn
1795 1800 1805

Ile Lys Gln Lys Leu Arg Gln Gln Gln Leu Gln His Arg Leu Gln Gln
1810 1815 1820

Ala Gln Met Leu Arg Arg Arg Met Ala Ser Met Gln Arg Thr Gly Val
1825 1830 1835 1840

Val Gly Gln Gln Gln Gly Leu Pro Ser Pro Thr Pro Ala Thr Pro Thr
1845 1850 1855

Thr Pro Thr Gly Gln Gln Pro Thr Thr Pro Gln Thr Pro Gln Pro Thr
1860 1865 1870

Ser Gln Pro Gln Pro Thr Pro Pro Asn Ser Met Pro Pro Tyr Leu Pro
1875 1880 1885

Arg Thr Gln Ala Ala Gly Pro Val Ser Gln Gly Lys Ala Ala Gly Gln
1890 1895 1900

Val Thr Pro Pro Thr Pro Pro Gln Thr Ala Gln Pro Pro Leu Pro Gly
1905 1910 1915 1920

Pro Pro Pro Thr Ala Val Glu Met Ala Met Gln Ile Gln Arg Ala Ala
1925 1930 1935

Glu Thr Gln Arg Gln Met Ala His Val Gln Ile Phe Gln Arg Pro Ile
1940 1945 1950

Gln His Gln Met Pro Pro Met Thr Pro Met Ala Pro Met Gly Met Asn
1955 1960 1965

Pro Pro Pro Met Thr Arg Gly Pro Ser Gly His Leu Glu Pro Gly Met
1970 1975 1980

Gly Pro Thr Gly Met Gln Gln Gln Pro Pro Trp Ser Gln Gly Gly Leu
1985 1990 1995 2000

Pro Gln Pro Gln Gln Leu Gln Ser Gly Met Pro Arg Pro Ala Met Met
2005 2010 2015

Ser Val Ala Gln His Gly Gln Pro Leu Asn Met Ala Pro Gln Pro Gly
2020 2025 2030

Leu Gly Gln Val Gly Ile Ser Pro Leu Lys Pro Gly Thr Val Ser Gln
2035 2040 2045

Gln Ala Leu Gln Asn Leu Leu Arg Thr Leu Arg Ser Pro Ser Ser Pro
2050 2055 2060

Leu Gln Gln Gln Gln Val Leu Ser Ile Leu His Ala Asn Pro Gln Leu
2065 2070 2075 2080

Leu Ala Ala Phe Ile Lys Gln Arg Ala Ala Lys Tyr Ala Asn Ser Asn
2085 2090 2095

Pro Gln Pro Ile Pro Gly Gln Pro Gly Met Pro Gln Gly Gln Pro Gly
2100 2105 2110

Leu Gln Pro Pro Thr Met Pro Gly Gln Gln Gly Val His Ser Asn Pro
2115 2120 2125

Ala Met Gln Asn Met Asn Pro Met Gln Ala Gly Val Gln Arg Ala Gly
2130 2135 2140

Leu Pro Gln Gln Gln Pro Gln Gln Gln Leu Gln Pro Pro Met Gly Gly
2145 2150 2155 2160

Met Ser Pro Gln Ala Gln Gln Met Asn Met Asn His Asn Thr Met Pro
2165 2170 2175

Ser Gln Phe Arg Asp Ile Leu Arg Arg Gln Gln Met Met Gln Gln Gln
2180 2185 2190

Gln Gln Gln Gly Ala Gly Pro Gly Ile Gly Pro Gly Met Ala Asn His
2195 2200 2205

Asn Gln Phe Gln Gln Pro Gln Gly Val Gly Tyr Pro Pro Gln Pro Gln
2210 2215 2220

Gln Arg Met Gln His His Met Gln Gln Met Gln Gln Gly Asn Met Gly
2225 2230 2235 2240

Gln Ile Gly Gln Leu Pro Gln Ala Leu Gly Ala Glu Ala Gly Ala Ser
2245 2250 2255

Leu Gln Ala Tyr Gln Gln Arg Leu Leu Gln Gln Gln Met Gly Ser Pro
2260 2265 2270

Val Gln Pro Asn Pro Met Ser Pro Gln Gln His Met Leu Pro Asn Gln
2275 2280 2285

Ala Gln Ser Pro His Leu Gln Gly Gln Gln Ile Pro Asn Ser Leu Ser
2290 2295 2300

Asn Gln Val Arg Ser Pro Gln Pro Val Pro Ser Pro Arg Pro Gln Ser
2305 2310 2315 2320

Gln Pro Pro His Ser Ser Pro Ser Pro Arg Met Gln Pro Gln Pro Ser
2325 2330 2335

Pro His His Val Ser Pro Gln Thr Ser Ser Pro His Pro Gly Leu Val
2340 2345 2350

Ala Ala Gln Ala Asn Pro Met Glu Gln Gly His Phe Ala Ser Pro Asp
2355 2360 2365

Gln Asn Ser Met Leu Ser Gln Leu Ala Ser Asn Pro Gly Met Ala Asn
2370 2375 2380

Leu His Gly Ala Ser Ala Thr Asp Leu Gly Leu Ser Thr Asp Asn Ser
2385 2390 2395 2400

Asp Leu Asn Ser Asn Leu Ser Gln Ser Thr Leu Asp Ile His
2405 2410

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2441 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Ala Glu Asn Leu Leu Asp Gly Pro Pro Asn Pro Lys Arg Ala Lys
1 5 10 15

Leu Ser Ser Pro Gly Phe Ser Ala Asn Asp Asn Thr Asp Phe Gly Ser
20 25 30

Leu Phe Asp Leu Glu Asn Asp Leu Pro Asp Glu Leu Ile Pro Asn Gly
35 40 45

Glu Leu Ser Leu Leu Asn Ser Gly Asn Leu Val Pro Asp Ala Ala Ser
50 55 60

Lys His Lys Gln Leu Ser Glu Leu Leu Arg Gly Gly Ser Gly Ser Ser
65 70 75 80

Ile Asn Pro Gly Ile Gly Asn Val Ser Ala Ser Ser Pro Val Gln Gln
85 90 95

Gly Leu Gly Gly Gln Ala Gln Gly Gln Pro Asn Ser Thr Asn Met Ala
100 105 110

Ser Leu Gly Ala Met Gly Lys Ser Pro Leu Asn Gln Gly Asp Ser Ser
115 120 125

Thr Pro Asn Leu Pro Lys Gln Ala Ala Ser Thr Ser Gly Pro Thr Pro
130 135 140

Pro Ala Ser Gln Ala Leu Asn Pro Gln Ala Gln Lys Gln Val Gly Leu
145 150 155 160

Val Thr Ser Ser Pro Ala Thr Ser Gln Thr Gly Pro Gly Ile Cys Met
165 170 175

Asn Ala Asn Phe Asn Gln Thr His Pro Gly Leu Leu Asn Ser Asn Ser
180 185 190

Gly His Ser Leu Met Asn Gln Ala Gln Gln Gly Gln Ala Gln Val Met
195 200 205

Asn Gly Ser Leu Gly Ala Ala Gly Arg Gly Arg Gly Ala Gly Met Pro
210 215 220

Tyr Pro Ala Pro Ala Met Gln Gly Ala Thr Ser Ser Val Leu Ala Glu
225 230 235 240

Thr Leu Thr Gln Val Ser Pro Gln Met Ala Gly His Ala Gly Leu Asn
245 250 255

Thr Ala Gln Ala Gly Gly Met Thr Lys Met Gly Met Thr Gly Thr Thr
260 265 270

Ser Pro Phe Gly Gln Pro Phe Ser Gln Thr Gly Gly Gln Gln Met Gly
275 280 285

Ala Thr Gly Val Asn Pro Gln Leu Ala Ser Lys Gln Ser Met Val Asn
290 295 300

Ser Leu Pro Ala Phe Pro Thr Asp Ile Lys Asn Thr Ser Val Thr Thr
305 310 315 320

Val Pro Asn Met Ser Gln Leu Gln Thr Ser Val Gly Ile Val Pro Thr
325 330 335

Gln Ala Ile Ala Thr Gly Pro Thr Ala Asp Pro Glu Lys Arg Lys Leu
340 345 350

Ile Gln Gln Gln Leu Val Leu Leu Leu His Ala His Lys Cys Gln Arg
355 360 365

Arg Glu Gln Ala Asn Gly Glu Val Arg Ala Cys Ser Leu Pro His Cys
370 375 380

Arg Thr Met Lys Asn Val Leu Asn His Met Thr His Cys Gln Ala Pro
385 390 395 400

Lys Ala Cys Gln Val Ala His Cys Ala Ser Ser Arg Gln Ile Ile Ser
405 410 415

His Trp Lys Asn Cys Thr Arg His Asp Cys Pro Val Cys Leu Pro Leu
420 425 430

Lys Asn Ala Ser Asp Lys Arg Asn Gln Gln Thr Ile Leu Gly Ser Pro
435 440 445

Ala Ser Gly Ile Gln Asn Thr Ile Gly Ser Val Gly Ala Gly Gln Gln
450 455 460

Asn Ala Thr Ser Leu Ser Asn Pro Asn Pro Ile Asp Pro Ser Ser Met
465 470 475 480

Gln Arg Ala Tyr Ala Ala Leu Gly Leu Pro Tyr Met Asn Gln Pro Gln
485 490 495

Thr Gln Leu Gln Pro Gln Val Pro Gly Gln Gln Pro Ala Gln Pro Pro
500 505 510

Ala His Gln Gln Met Arg Thr Leu Asn Ala Leu Gly Asn Asn Pro Met
515 520 525

Ser Val Pro Ala Gly Gly Ile Thr Thr Asp Gln Gln Pro Pro Asn Leu
530 535 540

Ile Ser Glu Ser Ala Leu Pro Thr Ser Leu Gly Ala Thr Asn Pro Leu
545 550 555 560

Met Asn Asp Gly Ser Asn Ser Gly Asn Ile Gly Ser Leu Ser Thr Ile
565 570 575

Ser Gln Ser Thr Ser Pro Ser Gln Pro Arg Lys Lys Ile Phe Lys Pro
1075 1080 1085

Glu Glu Leu Arg Gln Ala Leu Met Pro Thr Leu Glu Ala Leu Tyr Arg
1090 1095 1100

Gln Asp Pro Glu Ser Leu Pro Phe Arg Gln Pro Val Asp Pro Gln Leu
1105 1110 1115 1120

Leu Gly Ile Pro Asp Tyr Phe Asp Ile Val Lys Asn Pro Met Asp Leu
1125 1130 1135

Ser Thr Ile Lys Arg Lys Leu Asp Thr Gly Gln Tyr Gln Glu Pro Trp
1140 1145 1150

Gln Tyr Val Asp Asp Val Arg Leu Met Phe Asn Asn Ala Trp Leu Tyr
1155 1160 1165

Asn Arg Lys Thr Ser Arg Val Tyr Lys Phe Cys Ser Lys Leu Ala Glu
1170 1175 1180

Val Phe Glu Gln Glu Ile Asp Pro Val Met Gln Ser Leu Gly Tyr Cys
1185 1190 1195 1200

Cys Gly Arg Lys Tyr Glu Phe Ser Pro Gln Thr Leu Cys Cys Tyr Gly
1205 1210 1215

Lys Gln Leu Cys Thr Ile Pro Arg Asp Ala Ala Tyr Tyr Ser Tyr Gln
1220 1225 1230

Asn Arg Tyr His Phe Cys Gly Lys Cys Phe Thr Glu Ile Gln Gly Glu
1235 1240 1245

Asn Val Thr Leu Gly Asp Asp Pro Ser Gln Pro Gln Thr Thr Ile Ser
1250 1255 1260

Lys Asp Gln Phe Glu Lys Lys Lys Asn Asp Thr Leu Asp Pro Glu Pro
1265 1270 1275 1280

Phe Val Asp Cys Lys Glu Cys Gly Arg Lys Met His Gln Ile Cys Val
1285 1290 1295

Leu His Tyr Asp Ile Ile Trp Pro Ser Gly Phe Val Cys Asp Asn Cys
1300 1305 1310

Leu Lys Lys Thr Gly Arg Pro Arg Lys Glu Asn Lys Phe Ser Ala Lys
1315 1320 1325

Arg Leu Gln Thr Thr Arg Leu Gly Asn His Leu Glu Asp Arg Val Asn
1330 1335 1340

Lys Phe Leu Arg Arg Gln Asn His Pro Glu Ala Gly Glu Val Phe Val
1345 1350 1355 1360

Arg Val Val Ala Ser Ser Asp Lys Thr Val Glu Val Lys Pro Gly Met
1365 1370 1375

Lys Ser Arg Phe Val Asp Ser Gly Glu Met Ser Glu Ser Phe Pro Tyr
1380 1385 1390

Arg Thr Lys Ala Leu Phe Ala Phe Glu Glu Ile Asp Gly Val Asp Val
1395 1400 1405

Cys Phe Phe Gly Met His Val Gln Asp Thr Ala Leu Ile Ala Pro His
1410 1415 1420

Gln Ile Gln Gly Cys Val Tyr Ile Ser Tyr Leu Asp Ser Ile His Phe
1425 1430 1435 1440

Phe Arg Pro Arg Cys Leu Arg Thr Ala Val Tyr His Glu Ile Leu Ile
1445 1450 1455

Gly Tyr Leu Glu Tyr Val Lys Lys Leu Val Tyr Val Thr Ala His Ile
1460 1465 1470

Trp Ala Cys Pro Pro Ser Glu Gly Asp Asp Tyr Ile Phe His Cys His
1475 1480 1485

Pro Pro Asp Gln Lys Ile Pro Lys Pro Lys Arg Leu Gln Glu Trp Tyr
1490 1495 1500

Lys Lys Met Leu Asp Lys Ala Phe Ala Glu Arg Ile Ile Asn Asp Tyr
1505 1510 1515 1520

Lys Asp Ile Phe Lys Gln Ala Asn Glu Asp Arg Leu Thr Ser Ala Lys
1525 1530 1535

Glu Leu Pro Tyr Phe Glu Gly Asp Phe Trp Pro Asn Val Leu Glu Glu
1540 1545 1550

Ser Ile Lys Glu Leu Glu Gln Glu Glu Glu Glu Arg Lys Lys Glu Glu
1555 1560 1565

Ser Thr Ala Ala Ser Glu Thr Pro Glu Gly Ser Gln Gly Asp Ser Lys
1570 1575 1580

Asn Ala Lys Lys Lys Asn Asn Lys Lys Thr Asn Lys Asn Lys Ser Ser
1585 1590 1595 1600

Ile Ser Arg Ala Asn Lys Lys Lys Pro Ser Met Pro Asn Val Ser Asn
1605 1610 1615

Asp Leu Ser Gln Lys Leu Tyr Ala Thr Met Glu Lys His Lys Glu Val
1620 1625 1630

Phe Phe Val Ile His Leu His Ala Gly Pro Val Ile Ser Thr Gln Pro
1635 1640 1645

Pro Ile Val Asp Pro Asp Pro Leu Leu Ser Cys Asp Leu Met Asp Gly
1650 1655 1660

Arg Asp Ala Phe Leu Thr Leu Ala Arg Asp Lys His Trp Glu Phe Ser
1665 1670 1675 1680

Ser Leu Arg Arg Ser Lys Trp Ser Thr Leu Cys Met Leu Val Glu Leu
1685 1690 1695

His Thr Gln Gly Gln Asp Arg Phe Val Tyr Thr Cys Asn Glu Cys Lys
1700 1705 1710

His His Val Glu Thr Arg Trp His Cys Thr Val Cys Glu Asp Tyr Asp
1715 1720 1725

Leu Cys Ile Asn Cys Tyr Asn Thr Lys Ser His Thr His Lys Met Val
1730 1735 1740

Lys Trp Gly Leu Gly Leu Asp Asp Glu Gly Ser Ser Gln Gly Glu Pro
1745 1750 1755 1760

Gln Ser Lys Ser Pro Gln Glu Ser Arg Arg Leu Ser Ile Gln Arg Cys
1765 1770 1775

Ile Gln Ser Leu Val His Ala Cys Gln Cys Arg Asn Ala Asn Cys Ser
1780 1785 1790

Leu Pro Ser Cys Gln Lys Met Lys Arg Val Val Gln His Thr Lys Gly
1795 1800 1805

Cys Lys Arg Lys Thr Asn Gly Gly Cys Pro Val Cys Lys Gln Leu Ile
1810 1815 1820

Ala Leu Cys Cys Tyr His Ala Lys His Cys Gln Glu Asn Lys Cys Pro
1825 1830 1835 1840

Val Pro Phe Cys Leu Asn Ile Lys His Asn Val Arg Gln Gln Gln Ile
1845 1850 1855

Gln His Cys Leu Gln Gln Ala Gln Leu Met Arg Arg Arg Met Ala Thr
1860 1865 1870

Met Asn Thr Arg Asn Val Pro Gln Gln Ser Leu Pro Ser Pro Thr Ser
1875 1880 1885

Ala Pro Pro Gly Thr Pro Thr Gln Gln Pro Ser Thr Pro Gln Thr Pro
1890 1895 1900

Gln Pro Pro Ala Gln Pro Gln Pro Ser Pro Val Asn Met Ser Pro Ala
1905 1910 1915 1920

Gly Phe Pro Asn Val Ala Arg Thr Gln Pro Pro Thr Ile Val Ser Ala
1925 1930 1935

Gly Lys Pro Thr Asn Gln Val Pro Ala Pro Pro Pro Pro Ala Gln Pro
1940 1945 1950

Pro Pro Ala Ala Val Glu Ala Ala Arg Gln Ile Glu Arg Glu Ala Gln
1955 1960 1965

Gln Gln Gln His Leu Tyr Arg Ala Asn Ile Asn Asn Gly Met Pro Pro
1970 1975 1980

Gly Arg Asp Gly Met Gly Thr Pro Gly Ser Gln Met Thr Pro Val Gly
1985 1990 1995 2000

Leu Asn Val Pro Arg Pro Asn Gln Val Ser Gly Pro Val Met Ser Ser
2005 2010 2015

Met Pro Pro Gly Gln Trp Gln Gln Ala Pro Ile Pro Gln Gln Gln Pro
2020 2025 2030

Met Pro Gly Met Pro Arg Pro Val Met Ser Met Gln Ala Gln Ala Ala
2035 2040 2045

Val Ala Gly Pro Arg Met Pro Asn Val Gin Pro Asn Arg Ser He Ser 
2050 2055 - 2060 
Pro Ser Ala Leu Gln Asp Leu Leu Arg Thr Leu Lys Ser Pro Ser Ser 
2065 2070 2075 2080 

Pro Gln Gln Gln Gln Gln Val Leu Asn Ile Leu Lys Ser Asn Pro Gln 

2085 2090 2095 

Leu Met Ala Ala Phe Ile Lys Gln Arg Thr Ala Lys Tyr Val Ala Asn 

2100 2105 2110 

Gln Pro Gly Met Gln Pro Gln Pro Gly Leu Gln Ser Gln Pro Gly Met 

2115 2120 2125 

Gln Pro Gln Pro Gly Met His Gln Gln Pro Ser Leu Gln Asn Leu Asn 

2130 2135 2140 

Ala Met Gln Ala Gly Val Pro Arg Pro Gly Val Pro Pro Pro Gln Pro 
2145 2150 2155 2160 

Ala Met Gly Gly Leu Asn Pro Gln Gly Gln Ala Leu Asn Ile Met Asn 

2165 2170 2175 

Pro Gly His Asn Pro Asn Met Thr Asn Met Asn Pro Gln Tyr Arg Glu 

2180 2185 2190 

Met Val Arg Arg Gln Leu Leu Gln His Gln Gln Gln Gln Gln Gln Gln 

2195 2200 2205 

Gln Gln Gln Gln Gln Gln Gln Gln Asn Ser Ala Ser Leu Ala Gly Gly 

2210 2215 2220 

Met Ala Gly His Ser Gln Phe Gln Gln Pro Gln Gly Pro Gly Gly Tyr 
2225 2230 2235 2240 

Ala Pro Ala Met Gln Gln Gln Arg Met Gln Gln His Leu Pro Ile Gln 

2245 2250 2255 

Gly Ser Ser Met Gly Gln Met Ala Ala Pro Met Gly Gln Leu Gly Gln 

2260 2265 2270 

Met Gly Gln Pro Gly Leu Gly Ala Asp Ser Thr Pro Asn Ile Gln Gln 

2275 2280 2285 

Ala Leu Gln Gln Arg Ile Leu Gln Gln Gln Gln Met Lys Gln Gln Ile 

2290 2295 2300 

Gly Ser Pro Gly Gln Pro Asn Pro Met Ser Pro Gln Gln His Met Leu 
2305 2310 2315 2320 

Ser Gly Gln Pro Gln Ala Ser His Leu Pro Gly Gln Gln Ile Ala Thr 

2325 2330 2335 

Ser Leu Ser Asn Gln Val Arg Ser Pro Ala Pro Val Gln Ser Pro Arg 

2340 2345 2350 

Pro Gln Ser Gln Pro Pro His Ser Ser Pro Ser Pro Arg Ile Gln Pro 

2355 2360 2365 

Gln Pro Ser Pro His His Val Ser Pro Gln Thr Gly Thr Pro His Pro 

2370 2375 2380 

Gly Leu Ala Val Thr Met Ala Ser Ser Met Asp Gln Gly His Leu Gly 
2385 2390 2395 2400 

Asn Pro Glu Gln Ser Ala Met Leu Pro Gln Leu Asn Thr Pro Asn Arg 

2405 2410 2415 

Ser Ala Leu Ser Ser Glu Leu Ser Leu Val Gly Asp Thr Thr Gly Asp 

2420 2425 2430 

Thr Leu Glu Lys Phe Val Glu Gly Leu 
2435 2440 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 813 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 



Met Ala Glu Ala Gly Gly Ala Gly Ser Pro Ala Leu Pro Pro Ala Pro 

1 5 10 15 
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Cr i. w 








100 


Ser 


Leu 


Thr 


Glu 






115 




Val 


Ser 


His 


Leu 




130 






Gly 


He 


Val 


Leu 


145 








Glu 


Asp 


Ala 


Asp 


Arg 


Lys 




i. X C 








180 


Glu 


Lys , 


Lys 


Pro 






195 




Asn 


Phe 


Val 


Gin 




210 






Thr 


Thr 


He 


Glu 


225 








His 


Leu 


Glu 


Ala 


J. ±e 


o er^ 


. VJi y 


Tyr 








260 


Val 


Pro 


Gin 


Phe 






275 




Phe 


Gly 


Arg 


Thr 




290 






Leu 


Leu 


Glu 


Gin 


305 








Arg 


Thr 


Leu 


He 


Glu 


Glu 


Val 


Tyr 








340 


Ser 


Ala 


Ser 


Ser 






355 




Pro 


Pro 


Val 


Thr 




370 






Glu 


Gin 


lie 


Asn 


385 








Gly 


Leu 


Glu 


Ala 


AXa 


Pro 


Gl U 


OX U 








420 


Glu 


Leu 


He 


Asn 






435 




Leu 


Gly 


Pro 


Glu 




450 






Ala 


Ala 


Arg 


Leu 


465 








Gly 


Asn 


Ser 


Leu 


Val 


Gly 


Leu 


Gin 



Pro 


Arg 


Thr 


Leu 


Ala 


Thr 


P ro 


Val 








40 


Gly 


Ser 


Ala 


Arg 






55 




Arg 


Ala 


Lys 


Lys 




70 






Glu 


Glu 


Ser 


Cys 


85 








Thr 


Pro 


Pro 


Arg 


C A V 

oe r 


Cys 


Arg 


Ser 








120 


Glu 


Asn 


Val 


Ser 






135 




Asp 


Val 


Glu 


Tyr 




150 






Thr 


Lys 


Gin 


Val 


165 








Leu 


Gin 


Arg 


Gly 


Pro 


Phe 


Glu 


Lys 








200 


Tyr 


Lys 


Phe 


Ser 






215 




Leu 


Ala 


Lys 


Met 




230 






Pro 


Ser 


Gin 


Arg 


245 








Lys 


Glu 


Asn 


Tyr 


Cys 


Asp 


Ser 


Leu 








280 


Leu 


Leu 


Arg 


Ser 






295 




Ala 


Arg 


Gin 


Lys 




310 






Leu 


Thr 


His 


Phe 


325 








Ser 


Gin 


Asn 


Ser 


Arg 


Tn r 


ser 


Pro 








360 


Gly 


Thr 


Ala 


Leu 






375 




Gly 


Gly 


Arg 


Thr 




390 






Asn 


Pro 


Gly 


Glu 


405 








Ala 


Lys 


Arg 


Ser 


Glu 


Val 


Met 


Ser 








440 


Thr 


Asn 


Phe 


Leu 






455 




Glu 


Glu 


Arg 


Arg 




470 






Asn 


Gin 


Lys 


Pro 


485 








Asn 


Val 


Phe 


Ser 



87 



Ala 


Thr 


Ala 


Ala 


25 








Ala 


Ala 


Ala 


Gly 


He 


Ala 


V Ct X 


ij y o 








60 


Leu 


Glu 


Lys 


Leu 






75 




Lys 


Cys 


Asn 


Gly 




90 






Gly Asp 


Leu 


Gin 


105 








Cys 


Ser 


His 


PJ-a 


Glu 


Glu 


LjI U 


Met 








140 


Leu 


Phe 


Thr 


Cys 






155 




Tyr 


Phe 


Tyr 


Leu 




170 






Lys 


Pro 


Val 


Val 


185 








Pro 


Ser 


He 


Glu 


His 


Leu 


r X. u 


OCX 








220 


Phe 


Leu 


Asn 


Arg 






235 




Arg 


Leu 


Arg 


Ser 




250 






Thr 


Arg 


Trp 


Leu 


265 








Pro 


Arg 


Tyr 


Glu 


Val 


Phe 


i nr 


J. 1 e 








300 


Lvs 


Asp 


Lys 


Leu 






315 




Pro 


Lys 


Phe 


Leu 




330 






Pro 


He 


Trp 


Asp 


3'4 5 








Leu 


Gly 


He 


Gin 


Phe 


Ser 


^e r 


Asn 








380 


Ser 


Pro 


Gly 


Cys 






395 




Lys 


Arg 


Lys 


Met 




410 






Arg 


Val 


Met 


Gly 


425 








Thr 


He 


Thr 


Asp 


Ser 


Ala 


His 


Ser 








460 


Gly 


Val 


He 


Glu 






475 




Asn 


Lys 


Lys 


He 




490 






His 


Gin 


Leu 


Pro 


505 









Gly 


Ser 


Ser 


Ala 




30 






Thr 


Ala 


Glu 


Gly 


45 








Lys 


Ala 


Gin 


Leu 


Gly 


Val 


1 y X 


Ser 








80 


Trp 


Lys 


Asn 


Pro 






95 




Gin 


lie 


He 


Val 




110 






Leu 


Ala 


Ala 


His 


12 5' 








Asp 


Arg 


Leu 


Leu 


Val 


n X ^ 


Lys 


ox 








160 


Phe 


Lys 


Leu 


Leu 






175 




Glu 


Gly 


Ser 


Leu 




190 






Gin 


Gly 


Val 


Asn 


205 






Lys 


Glu 


Arg 


Gin 


X X c 


As n 


i yr 


I rp 








240 


Pro 


Asn 


Asp 


Asp 






255 




Cys 


Tyr 


Cys 


Asn 




■270 






Thr 


Thr 


Lys 


Val 


285 








Met 


Arg 


Arg 


Gin 


Pro 


Leu 


/"•111 

CjI u 


Lys 








320 


Ser 


Met 


Leu 


Glu 






335 




Gin 


Asp 


Phe 


Leu 




350 






Thr 


Val 


He 


Ser 


365 








Ser 


Thr 


Ser 


His 


Arg 


Gly 


Ser 


^ tr X. 








400 


Asn 


Asn 


Ser 


His 






415 




Asp 


He 


Pro 


Val 




430 






Pro 


Ala 


Gly 


Met 


4 4^5 








Ala 


Arg Asp 


Glu 


Phe 


His 


Val 


Val 








480 


Leu 


Met 


Trp 


Leu 






495 




Arg 


Met 


Pro 


Lys 




510 
Glu Tyr Ile Thr Arg Leu Val Phe Asp Pro Lys His Lys Thr Leu Ala 

515 520 525 

Leu Ile Lys Asp Gly Arg Val Ile Gly Gly Ile Cys Phe Arg Met Phe 

530 535 540 

Pro Ser Gln Gly Phe Thr Glu Ile Val Phe Cys Ala Val Thr Ser Asn 
545 550 555 560 

Glu Gln Val Lys Gly Tyr Gly Thr His Leu Met Asn His Leu Lys Glu 

565 570 575 

Tyr His Ile Lys His Glu Ile Leu Asn Phe Leu Thr Tyr Ala Asp Glu 

580 585 590 

Tyr Ala Ile Gly Tyr Phe Lys Lys Gln Gly Phe Ser Lys Glu Ile Lys 

595 600 605 

Ile Pro Lys Thr Lys Tyr Val Gly Tyr Ile Lys Asp Tyr Glu Gly Ala 

610 615 620 

Thr Leu Met Gly Cys Glu Leu Asn Pro Gln Ile Pro Tyr Thr Glu Phe 
625 630 635 640 

Ser Val Ile Ile Lys Lys Gln Lys Glu Ile Ile Lys Lys Leu Ile Glu 

645 650 655 

Arg Lys Gln Ala Gln Ile Arg Lys Val Tyr Pro Gly Leu Ser Cys Phe 

660 665 670 

Lys Asp Gly Val Arg Gln Ile Pro Ile Glu Ser Ile Pro Gly Ile Arg 

675 680 685 

Glu Thr Gly Trp Lys Pro Ser Gly Lys Glu Lys Ser Lys Glu Pro Lys 

690 695 700 

Asp Pro Glu His Val Tyr Ser Thr Leu Lys Asn Ile Leu Gln Gln Val 
705 710 715 720 

Lys Asn His Pro Asn Ala Trp Pro Phe Met Glu Pro Val Lys Arg Thr 

725 730 735 

Glu Ala Pro Gly Tyr Tyr Glu Val Ile Arg Phe Pro Met Asp Leu Lys 

740 745 750 

Thr Met Ser Glu Arg Leu Arg Asn Arg Tyr Tyr Val Ser Lys Lys Leu 

755 760 765 

Phe Met Ala Asp Leu Gln Arg Val Phe Thr Asn Cys Lys Glu Tyr Asn 

770 775 780 

Pro Pro Glu Ser Glu Tyr Tyr Lys Cys Ala Ser Ile Leu Glu Lys Phe 
785 790 795 800 

Phe Phe Ser Lys Ile Lys Glu Ala Gly Leu Ile Asp Lys 
805 810 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

His Thr Lys Gly Cys Lys Arg Lys Thr Asn Gly Gly Cys Pro Val Cys 

1 5 10 15 

Lys Gln Leu Ile Ala Leu Cys Cys Tyr His Ala Lys His Cys Gln Glu 

20 25 30 

Asn Lys Cys Pro Val Pro Phe Cys Leu Asn Ile Lys His Asn Val Arg 

35 40 45 

Gln Gln 
50 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2204 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

ACCCACTCCC CCCAGAGCCG ACCTGCAGCA AATAATTGTC AGTCTAACAG AATCCTGTCG 60 

GAGTTGTAGC CATGCCCTAG CTGCTCATGT TTCCCACCTG GAGAATGTGT CAGAGGAAGA 120 

AATGAACAGA CTCCTGGGAA TAGTATTGGA TGTGGAATAT GTCTTTACCT GTGTCCACAA 180 

GGAAGAAGAT GCAGATACCA AACAAGTTTA TTTCTATCTA TTTAAGCTCT TGAGAAAGTC 240 

TATTTTACAA AGAGGAAAAC CTGTGGTTGG AAGGCTCTTT GGAAAAGAAA CCCCCATTTG 300 

AAAAACCTAG CATTGAACAG GGTGTGAATA ACTTTGTGCA GTACAAATTT AGTCACCTGC 360 

CAGCAAAAAG AAAGGCAAAC CAATAGTTGA GTTGGCAAAA ATGTTCCTAA ACCGCATCAC 420 

CTATTGGCAT CTGGAGGCAC CATCTCAACG AGACTGCGAT CTCCAATGAT GATATTCTGG 480 

ATACAAAGAG AACTACACAA GGTGGCTGTG TTACTGCAAC GTGCCACAGT TCTGCGACAG 540 

TCTACCTCGG TACGAAACCA CACAGGTGTT TGGGAGAACA TCGTTCGCTC GGTCTTCACT 600 

GTTATGAGGC GACAACTCCT GGAACAAGCA AGACAGGAAA AAGATAAACT GCCTCTTGAA 660 

AAACGAACTC TAATCCTCAC TCATTTCCCA AAATTTCTGT CCATGCTAGA AGAAGAAGTA 720 

TATAGTCAAA ACTCTCCCAT CTGGGATCAC CATTTTCTCT CAGCCTCTTC CAGAACCAGC 780 

CAGCTAGGCA TCCAAACAGT TATCAATCAC CTCCTGTGGC TGGGACAATT TCATACAATT 840 

CAACCTCATC TTCCCTTGAG CAGCCAAACG CAGGGAGCAG CAGTCCTGCC TGCAAAGCCT 900 

CTTCTGGACT TGAGGCAAAC CCAGGAGAAA AGAGGAftAAT GACTGATTCT CATGTTCTGG 960 

AGGAGGCCAA GAAACCCCGA GTTATGGGGG ATATTCCGAT GGAATTAATC AACGAGGTTA 1020 

TGTCTACCAT CACGGACCCT GCAGCAATGC TTGGACCAGA GACCAATTTT CTGTCAGCAC 1080 

ACTCGGCCAG GGATGAGGCG GCAAGGTTGG AAGAGCGCAG GGGTGTAATT GAATTTCACG 1140 

TGGTTGGCAA TTCCCTCAAC CAGAAACCAA ACAAGAAGAT CCTGATGTGG CTGGTTGGCC 1200 

TACAGAACGT TTTCTCCCAC CAGCTGCCCC GAATGCCAAA AGAATACATC ACACGGCTCG 1260 

TCTTTGACCC GAAACACAAA ACCCTTGCTT TAATTAAAGA TGGCCGTGTT ATTGGTGGTA 1320 

TCTGTTTCCG TATGTTCCCA TCTCAAGGAT TCACAGAGAT TGTCTTCTGT GCTGTAACCT 1380 

CAAATGAGCA AGTCAAGGGC TATGGAACAC ACCTGATGAA TCATTTGAAA GAATATCACA 1440 

TAAAGCATGA CATCCTGAAC TTCCTCACAT ATGCAGATGA ATATGCAATT GGATACTTTA 1500 

AGAAACAGGG TTTCTCCAAA GAAATTAAAA TACCTAAAAC CAAATATGTT GGCTATATCA 1560 

AGGATTATGA AGGAGCCACT TTAATGGGAT GTGAGCTAAA TCCACGGATC CCGTACACAG 1620 

AATTTTCTGT CATCATTAAA AAGCAGAAGG AGATAATTAA AAAACTGATT GAAAGAAAAC 1680 

AGGCACAAAT TCGAAAAGTT TACCCTGGAC TTTCATGTTT TAAAGATGGA GTTCGACAGA 1740 

TTCCTATAGA AAGCATTCCT GGAATTAGAG AGACAGGCTG GAAACCGAGT GGAAAAGAGA 1800 

AAAGTAAAGA GCCCAGAGAC CCTGACCAGC TTTACAGCAC GCTCAAGAGC ATCCTCCAGC 1860 

AGGTGAAGAG CCATCAAAGC GCTTGGCCCT TCATGGAACC TGTGAAGAGA ACAGAAGCTC 1920 

CAGGATATTA TGAAGTTATA AGGTCCCCCA TGGATCTCAA AACCATGAGT GAACGCCTCA 1980 

AGAATAGGTA CTACGTGTCT AAGAAATTAT TCATGGCAGA CTTACAGCGA GTCTTTACCA 2040 

ATTGCAAAGA GTACAACGCC CCTGAGAGTG AATACTACAA ATGTGCCAAT ATCCTGGAGA 2100 

AATTCTTCTT CAGTAAAATT AAGGAAGCTG GATTAATTGA CAAGTGATTT TTTTTCCCCC 2160 

TCTGCTTCTT AGAAACTCAC CAAGCAGTGT GCCTAAAGCA AGGT 2204 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2093 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

GAATTCCGGC GAAACCACTC ATGTCTTTGG GCGAAGCCTT CTCCGGTCCA TTTTCACCGT 60 

TACCCGCCGG CAGCTGCTGG AAAAGTTCCG AGTGGAGAAG GACAAATTGG TGCCCGAGAA 120 

GAGGACCCTC ATCCTCACTC ACTTCCCCAA GTAAGGCTCC TTCTGGCCTA CCAGGATTTG 180 

GCCCCAAGTT CACATCCTCC CTGTTGTCCC CTTTTTTCCA GGAAGGCTTC CTGGATTGGT 240 

CCCTCCTCTC CCTCCATGGG CCTTTTGGGA TCTGGGCGTC TACCTGGCAG ACTTGCCCAT 300 

GGCCCAGAAG CAACTTGCTA GTACTAGTCT GGGGATGGCA GATTCCTGTC CATGCTGGAG 360 

GAGGAGATCT ATGGGGCAAA CTCTCCAATC TGGGAGTCAG GCTTCACCAT GCCACCCTCA 420 

GAGGGGACAC AGCTGGTTCC CCGGCCAGCT TCAGTCAGTG CAGCGGTTGT TCCCAGCACC 480 

CCCATCTTCA GCCCCAGCAT GGGTGGGGGC AGCAACAGCT CCCTGAGTCT GGATTCTGCA 540 

GGGGCCGAGC CTATGCCAGG CGAGAAGAGG ACGCTCCCAG AGAACCTGAC CCTGGAGGAT 600 

GCCAAGCGGC TCCGTGTGAT GGGTGACATC CCCATGGAGC TGGTCAATGA GGTCATGCTG 660 

ACCATCACTG ACCCTGCTGC CATGCTGGGG CCTGAGACGA GCCTGCTTTC GGCCAATGCG 720 

GCCCGGGATG AGACAGCCCG CCTGGAGGAG CGCCGCGGCA TCATCGAGTT CCATGTCATC 780 

GGCAACTCAC TGACGCCCAA GGCCAACCGG CGGGTGTTGC TGTGGCTCGT GGGGCTGCAG 840 

AATGTCTTTT CCCACCAGCT GCCGCGCATG CCTAAGGAGT ATATCGCCCG CCTCGTCTTT 900 

GACCCGAAGC ACAAGACTCT GGCCTTGATC AAGGATGGGC GGGTCATCGG TGGCATCTGC 960 

TTCCGCATGT TTCCCACCCA GGGCTTCACG GAGATTGTCT TCTGTGCTGT CACCTCGAAT 1020 

GAGCAGGTCA AGGGTTATGG GACCCACCTG ATGAACCACC TGAAGGAGTA TCACATCAAG 1080 

CACAACATTC TCTACTTCCT CACCTACGCC GACGAGTACG CCATCGGCTA CTTCAAAAAG 1140 

CAGGGTTTCT CCAAGGACAT CAAGGTGCCC AAGAGCCGCT ACCTGGGCTA CATCAAGGAC 1200 

TACGAGGGAG CGACGCTGAT GGAGTGTGAG CTGAATCCCC GCATCCCCTA CACGGAGCTG 1260 

TCCCACATCA TCAAGAAGCA GAAAGAGATC ATCAAGAAGC TGATTGAGCG CAAACAGGCC 1320 

CAGATCCGCA AGGTCTACCC GGGGCTCAGC TGCTTCAAGG AGGGCGTGAG GCAGATCCCT 1380 

GTGGAGAGCG TTCCTGGCAT TCGAGAGACA GGCTGGAAGC CATTGGGGAA GGAGAAGGGG 1440 

AAGGAGCTGA AGGACCCCGA CCAGCTCTAC ACAACCCTCA AAAACCTGCT GGCCCAAATC 1500 

AAGTCTCACC CCAGTGCCTG GCCCTTCATG GAGCCTGTGA AGAAGTCGGA GGCCCCTGAC 1560 

TACTACGAGG TCATCCGCTT CCCCATTGAC CTGAAGACCA TGACTGAGCG GCTGCGAAGC 1620 

CGCTACTACG TGACCCGGAA GCTCTTTGTG GCCGACCTGC AGCGGGTCAT CGCCAACTGT 1680 

CGCGAGTACA ACCCCCCGGA CAGCGAGTAC TGCCGCTGTG CCAGCGCCCT GGAGAAGTTC 1740 

TTCTACTTCA AGCTCAAGGA GGGAGCCCTC ATTGACAAGT AGGCCCATCT TTGGGCCGCA 1800 

GCCCTGACCT GGAATGTCTC CACCTCGGAT TCTGATCTGA TCCTTAGGGG GTGCCCTGGC 1860 

CCCACGGACC CGACTCAGCT TGAGACACTC CAGCCAAGGG TCCTCCGGAC CCGATCCTGC 1920 

AGCTCTTTCT GGACCTTCAG GCACCCCCAA GCGTGCAGCT CTGTCCCAGC CTTCACTGTG 1980 

TGTGAGAGGT CTCCTGGGTT GGGGCCCAGC CCCTCTAGAG TAGCTGGTGG CCAGGGATGA 2040 

ACCTTGCCCA GCCGTGGTGG CCCCCAGGCC TGGTCCCCAA GAGCCCGGAA TTC 2093 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9046 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

CCTTGTTTGT GTGCTAGGCT GGGGGGGAGA GAGGGCGAGA GAGAGCGGGC GAGAGTGGGC 60 

AAGCAGGACG CCGGGCTGAG TGCTAACTGC GGGACGCAGA GAGTGCGGAG GGGAGTCGGG 120 

TCGGAGAGAG GCGGCAGGGG CCAGAACAGT GGCAGGGGGC CCGGGGCGCA CGGGCTGAGG 180 

CGACCCCCAG CCCCCTCCCG TCCGCACACA CCCCCACCGC GGTCCAGCAG CCGGGCCGGC 240 

GTCGACGCTA GGGGGGACCA TTACATAACC CGCGCCCCGG CCGTCTTCTC CCGCCGCCGC 300 

GGCGCCCGAA CTGAGCCCGG GGCGGGCGCT CCAGCACTGG CCGCCGGCGT GGGGCGTAGC 360 

AGCGGCCGTA TTATTATTTC GCGGAAAGGA AGGCGAAGGA GGGGAGCGCC GGCGCGAGGA 420 

GGGGCCGCCT GCGCCCGCCG CCGGAGCGGG GCCTCCTCGG TGGGCTCCGC GTCGGCGCGG 480 

GCGTGCGGGC GGCGCTGCTC GGCCCGGCCC CCTCGGCCCT CTGGTCCGGC CAGCTCCGCT 540 

CCCGGCGTCC TTGCCGCGCC TCCGCCGGCC GCCGCGCGAT GTGAGGCGGC GGCGCCAGCC 600 

TGGCTCTCGG CTCGGGCGAG TTCTCTGCGG CCATTAGGGG CCGGTGCGGC GGCGGCGCGG 660 

AGCGCGGCGG CAGGAGGAGG GTTCGGAGGG TGGGGGCGCA GGCCCGGGAG GGGGCACCGG 720 

GAGGAGGTGA GTGTCTCTTG TCGCCTCCTC CTCTCCCCCC TTTTCGCCCC CGCCTCCTTG 780 

TGGCGATGAG AAGGAGGAGG ACAGCGCCGA GGAGGAAGAG GTTGATGGCG GCGGCGGAGC 840 

TCCGAGAGAC CTCGGCTGGG CAGGGGCCGG CCGTGGCGGG CCGGGGACTG CGCCTCTAGA 900 

GCCGCGAGTT CTCGGGAATT CGCCGCAGCG GACCGGCCTC GGCGAATTTG TGCTCTTGTG 960 

CCCTCCTCCG GGCTTGGGCC AGGCCGGCCC CTCGCACTTG CCCTTACCTT TTCTATCGAG 1020 

TCCGCATCCC TCTCCAGCCA CTGCGACCCG GCGAAGAGAA AAAGGAACTT CCCCCACCCC 1080 

CTCGGGTGCC GTCGGAGCCC CCCAGCCCAC CCCTGGGTGC GGCGCGGGGA CCCCGGGCCG 1140 

AAGAAGAGAT TTCCTGAGGA TTCTGGTTTT CCTCGCTTGT ATCTCCGAAA GAATTAAAAA 1200 

TGGCCGAGAA TGTGGTGGAA CCGGGGCCGC CTTCAGCCAA GCGGCCTAAA CTCTCATCTC 1260 

CGGCCCTCTC GGCGTCCGCC AGCGATGGCA CAGATTTTGG CTCTCTATTT GACTTGGAGC 1320 

ACGACTTACC AGATGAATTA ATCAACTCTA CAGAATTGGG ACTAACCAAT GGTGGTGATA 1380 

TTAATCAGCT TCAGACAAGT CTTGGCATGG TACAAGATGC AGCTTCTAAA CATAAACAGC 1440 

TGTCAGAATT GCTGCGATCT GGTAGTTCCC CTAACCTCAA TATGGGAGTT GGTGGCCCAG 1500 

GTCAAGTCAT GGCCAGCCAG GCCCAACAGA GCAGTCCTGG ATTAGGTTTG ATAAATAGCA 1560 

TGGTCAAAAG CCCAATGACA CAGGCAGGCT TGACTTCTCC CAACATGGGG ATGGGCACTA 1620 

GTGGACCAAA TCAGGGTCCT ACGCAGTCAA CAGGTATGAT GAACAGTCCA GTAAATCAGC 1680 

CTGCCATGGG AATGAACACA GGGACGAATG CGGGCATGAA TCCTGGAATG TTGGCTGCAG 1740 

GCAATGGACA AGGGATAATG CCTAATCAAG TCATGAACGG TTCAATTGGA GCAGGCCGAG 1800 

GGCGACAGGA TATGCAGTAC CCAAACCCAG GCATGGGAAG TGCTGGCAAC TTACTGACTG 1860 

AGCCTCTTCA GCAGGGCTCT CCCCAGATGG GAGGACAAAC AGGATTGAGA GGCCCCCAGC 1920 

CTCTTAAGAT GGGAATGATG AACAACCCCA ATCCTTATGG TTCACCATAT ACTCAGAATC 1980 

CTGGACAGCA GATTGGAGCC AGTGGCCTTG GTCTCCAGAT TCAGACAAAA ACTGTACTAT 2040 

CAAATAACTT ATCTCCATTT GCTATGGACA AAAAGGCAGT TCCTGGTGGA GGAATGCCCA 2100 

ACATGGGTCA ACAGCCAGCC CCGCAGGTCC AGCAGCCAGG TCTGGTGACT CCAGTTGCCC 2160 

AAGGGATGGG TTCTGGAGCA CATACAGCTG ATCCAGAGAA GCGCAAGCTC ATCCAGCAGC 2220 

AGCTTGTTCT CCTTTTGCAT GCTCACAAGT GCCAGCGCCG GGAACAGGCC AATGGGGAAG 2280 

TGAGGCAGTG CAACCTTCCC CACTGTCGCA CAATGAAGAA TGTCCTAAAC CACATGACAC 2340 

ACTGCCAGTC AGGCAAGTCT TGCCAAGTGG CACACTGTGC ATCTTCTCGA CAAATCATTT 2400 

CACACTGGAA GAATTGTACA AGACATGATT GTCCTGTGTG TCTCCCCCTC AAAAATGCTG 2460 

GTGATAAGAG AAATCAACAG CCAATTTTGA CTGGAGCACC CGTTGGACTT GGAAATCCTA 2520 

GCTCTCTAGG GGTGGGTCAA CAGTCTGCCC CCAACCTAAG CACTGTTAGT CAGATTGATC 2580 

CCAGCTCCAT AGAAAGAGCC TATGCAGCTC TTGGACTACC CTATCAAGTA AATCAGATGC 2640 

CGACACAACC CCAGGTGCAA GCAAAGAACC AGCAGAATCA GCAGCCTGGG CAGTCTGCCC 2700 

AAGGCATGCG GCCCATGAGC AACATGAGTG CTAGTCCTAT GGGAGTAAAT GGAGGTGTAG 2760 

GAGTTCAAAC GCCGAGTCTT CTTTCTGACT CAATGTTGCA TTCAGCCATA AATTCTCAAA 2820 

ACCCAATGAT GAGTGAAAAT GCCAGTGTGC CCTCCCTGGG TCCTATGCCA ACAGCAGCTC 2880 

AACCATCCAC TACTGGAATT CGGAAACAGT GGCACGAAGA TATTACTCAG GATCTTCGAA 2940 

ATCATCTTGT TCACAAACTC GTCCAAGCCA TATTTCCTAC GCCGGATCCT GCTGCTTTAA 3000 

AAGACAGACG GATGGAAAAC CTAGTTGCAT ATGCTCGGAA AGTTGAAGGG GACATGTATG 3060 

AATCTGCAAA CAATCGAGCG GAATACTACC ACCTTCTAGC TGAGAAAATC TATAAGATCC 3120 

AGAAAGAACT AGAAGAAAAA CGAAGGACCA GACTACAGAA GCAGAACATG CTACCAAATG 3180 

CTGCAGGCAT GGTTCCAGTT TCCATGAATC CAGGGCCTAA CATGGGACAG CCGCAACCAG 3240 

GAATGACTTC TAATGGCCCT CTACCTGACC CAAGTATGAT CCGTGGCAGT GTGCCAAACC 3300 

AGATGATGCC TCGAATAACT CCACAATCTG GTTTGAATCA ATTTGGCCAG ATGAGCATGG 3360 

CCCAGCCCCC TATTGTACCC CGGCAAACCC CTCCTCTTCA GCACCATGGA CAGTTGGCTC 3420 

AACCTGGAGC TCTCAACCCG CCTATGGGCT ATGGGCCTCG TATGCAACAG CCTTCCAACC 3480 

AGGGCCAGTT CCTTCCTCAG ACTCAGTTCC CATCACAGGG AATGAATGTA ACAAATATCC 3540 

CTTTGGCTCC GTCCAGCGGT CAAGCTCCAG TGTCTCAAGC ACAAATGTCT AGTTCTTCCT 3600 

GCCCGGTGAA CTCTCCTATA ATGCCTCCAG GGTCTCAGGG GAGCCACATT CACTGTCCCC 3660 

AGCTTCCTCA ACCAGCTCTT CATCAGAATT CACCCTCGCC TGTACCTAGT CGTACCCCCA 3720 

CCCCTCACCA TACTCCCCCA AGCATAGGGG CTCAGCAGCC ACCAGCAACA ACAATTCCAG 3780 

CCCCTGTTCC TACACCACCA GCCATGCCAC CTGGGCCACA GTCCCAGGCT CTACATCCCC 3840 

CTCCAAGGCA GACACCTACA CCACCAACAA CACAACTTCC CCAACAAGTG CAGCCTTCAC 3900 

TTCCTGCTGC ACCTTCTGCT GACCAGCCCC AGCAGCAGCC TCGCTCACAG CAGAGCACAG 3960 

CAGCGTCTGT TCCTACCCCA AACGCACCGC TGCTTCCTCC GCAGCCTGCA ACTCCACTTT 4020 

CCCAGCCAGC TGTAAGCATT GAAGGACAGG TATCAAATCC TCCATCTACT AGTAGCACAG 4080 

AAGTGAATTC TCAGGCCATT GCTGAGAAGC AGCCTTCCCA GGAAGTGAAG ATGGAGGCCA 4140 

AAATGGAAGT GGATCAACCA GAACCAGCAG ATACGCAGCC GGAGGATATT TCAGAGTCTA 4200 

AAGTGGAAGA CTGTAAAATG GAATCTACCG AAACAGAAGA GAGAAGCACT GAGTTAAAAA 4260 

CTGAAATAAA AGAGGAGGAA GACCAGCCAA GTACTTCAGC TACCCAGTCA TCTCCGGCTC 4320 

CAGGACAGTC AAAGAAAAAG ATTTTCAAAC CAGAAGAACT ACGACAGGCA CTGATGCCAA 4380 

CATTGGAGGC ACTTTACCGT CAGGATCCAG AATCCCTTCC CTTTCGTCAA CCTGTGGACC 4440 

CTCAGCTTTT AGGAATCCCT GATTACTTTG ATATTGTGAA GAGCCCCATG GATCTTTCTA 4500 

CCATTAAGAG GAAGTTAGAC ACTGGACAGT ATCAGGAGCC CTGGCAGTAT GTCGATGATA 4560 

TTTGGCTTAT GTTCAATAAT GCCTGGTTAT ATAACCGGAA AACATCACGG GTATACAAAT 4620 

ACTGCTCCAA GCTCTCTGAG GTCTTTGAAC AAGAAATTGA CCCAGTGATG CAAAGCCTTG 4680 

GATACTGTTG TGGCAGAAAG TTGGAGTTCT CTCCACAGAC ACTGTGTTGC TACGGCAAAC 4740 

AGTTGTGCAC AATACCTCGT GATGCCACTT ATTACAGTTA CCAGAACAGG TATCATTTCT 4800 

GTGAGAAGTG TTTCAATGAG ATCCAAGGGG AGAGCGTTTC TTTGGGGGAT GACCCTTCCC 4860 

AGCCTCAAAC TACAATAAAT AAAGAACAAT TTTCCAAGAG AAAAAATGAC ACACTGGATC 4920 

CTGAACTGTT TGTTGAATGT ACAGAGTGCG GAAGAAAGAT GCATCAGATC TGTGTCCTTC 4980 

ACCATGAGAT CATCTGGCCT GCTGGATTCG TCTGTGATGG CTGTTTAAAG AAAAGTGCAC 5040 

GAACTAGGAA AGAAAATAAG TTTTCTGCTA AAAGGTTGCC ATCTACCAGA CTTGGCACCT 5100 

TTCTAGAGAA TCGTGTGAAT GACTTTCTGA GGCGACAGAA TCACCCTGAG TCAGGAGAGG 5160 

TCACTGTTAG AGTAGTTCAT GCTTCTGACA AAACCGTGGA AGTAAAACCA GGCATGAAAG 5220 

CAAGGTTTGT GGACAGTGGA GAGATGGCAG AATCCTTTCC ATACCGAACC AAAGCCCTCT 5280 

TTGCCTTTGA AGAAATTGAT GGTGTTGACC TGTGCTTCTT TGGCATGCAT GTTCAAGAGT 5340 

ATGGCTCTGA CTGCCCTCCA CCCAACCAGA GGAGAGTATA CATATCTTAC CTCGATAGTG 5400 

TTCATTTCTT CCGTCCTAAA TGCTTGAGGA CTGCAGTCTA TCATGAAATC CTAATTGGAT 5460 

ATTTAGAATA TGTCAAGAAA TTAGGTTACA CAACAGGGCA TATTTGGGCA TGTCCACCAA 5520 

GTGAGGGAGA TGATTATATC TTCCATTGCC ATCCTCCTGA CCAGAAGATA CCCAAGCCCA 5580 

AGCGACTGCA GGAATGGTAC AAAAAAATGC TTGACAAGGC TGTATCAGAG CGTATTGTCC 5640 

ATGACTACAA GGATATTTTT AAACAAGCTA CTGAAGATAG ATTAACAAGT GCAAAGGAAT 5700 

TGCCTTATTT CGAGGGTGAT TTCTGGCCCA ATGTTCTGGA AGAAAGCATT AAGGAACTGG 5760 

AACAGGAGGA AGAAGAGAGA AAACGAGAGG AAAACACCAG CAATGAAAGC ACAGATGTGA 5820 

CCAAGGGAGA CAGCAAAAAT GCTAAAAAGA AGAATAATAA GAAAACCAGC AAAAATAAGA 5880 

GCAGCCTGAG TAGGGGCAAC AAGAAGAAAC CCGGGATGCC CAATGTATCT AACGACCTCT 5940 

CACAGAAACT ATATGCCACC ATGGAGAAGC ATAAAGAGGT CTTCTTTGTG ATCCGCCTCA 6000 

TTGCTGGCCC TGCTGCCAAC TCCCTGCCTC CCATTGTTGA TCCTGATCCT CTCATCCCCT 6060 

GCGATCTGAT GGATGGTCGG GATGCGTTTC TCACGCTGGC AAGGGACAAG CACCTGGAGT 6120 

TCTCTTCACT CCGAAGAGCC CAGTGGTCCA CCATGTGCAT GCTGGTGGAG CTGCACACGC 6180 

AGAGCCAGGA CCGCTTTGTC TACACCTGCA ATGAATGCAA GCACCATGTG GAGACACGCT 6240 

GGCACTGTAC TGTCTGTGAG GATTATGACT TGTGTATCAC CTGCTATAAC ACTAAAAACC 6300 

ATGACCACAA AATGGAGAAA CTAGGCCTTG GCTTAGATGA TGAGAGCAAC AACCAGCAGG 6360 

CTGCAGCCAC CCAGAGCCCA GGCGATTCTC GCCGCCTGAG TATCCAGCGC TGCATCCAGT 6420 

CTCTGGTCCA TGCTTGCCAG TGTCGGAATG CCAATTGCTC ACTGCCATCC TGCCAGAAGA 6480 

TGAAGCGGGT TGTGCAGCAT ACCAAGGGTT GCAAACGGAA AACCAATGGC GGGTGCCCCA 6540 

TCTGCAAGCA GCTCATTGCC CTCTGCTGCT ACCATGCCAA GCACTGCCAG GAGAACAAAT 6600 

GCCCGGTGCC GTTCTGCCTA AACATCAAGC AGAAGCTCCG GCAGCAACAG CTGCAGCACC 6660 

GACTACAGCA GGCCCAAATG CTTCGCAGGA GGATGGCCAG CATGCAGCGG ACTGGTGTGG 6720 

TTGGGCAGCA ACAGGGCCTC CCTTCCCCCA CTCCTGCCAC TCCAACGACA CCAACTGGCC 6780 

AACAGCCAAC CACCCCGCAG ACGCCCCAGC CCACTTCTCA GCCTCAGCCT ACCCCTCCCA 6840 

ATAGCATGCC ACCCTACTTG CCCAGGACTC AAGCTGCTGG CCCTGTGTCC CAGGGTAAGG 6900 

CAGCAGGCCA GGTGACCCCT CCAACCCCTC CTCAGACTGC TCAGCCACCC CTTCCAGGGC 6960 

CCCCACCTAC AGCAGTGGAA ATGGCAATGC AGATTCAGAG AGCAGCGGAG ACGCAGCGCC 7020 

AGATGGCCCA CGTGCAAATT TTTCAAAGGC CAATCCAACA CCAGATGCCC CCGATGACTC 7080 

CCATGGCCCC CATGGGTATG AACCCACCTC CCATGACCAG AGGTCCCAGT GGGCATTTGG 7140 

AGCCAGGGAT GGGACCGACA GGGATGCAGC AACAGCCACC CTGGAGCCAA GGAGGATTGC 7200 

CTCAGCCCCA GCAACTACAG TCTGGGATGC CAAGGCCAGC CATGATGTCA GTGGCCCAGC 7260 

ATGGTCAACC TTTGAACATG GCTCCACAAC CAGGATTGGG CCAGGTAGGT ATCAGCCCAC 7320 

TCAAACCAGG CACTGTGTCT CAACAAGCCT TACAAAACCT TTTGCGGACT CTCAGGTCTC 7380 

CCAGCTCTCC CCTGCAGCAG CAACAGGTGC TTAGTATCCT TCACGCCAAC CCCCAGCTGT 7440 

TGGCTGCATT CATCAAGCAG CGGGCTGCCA AGTATGCCAA CTCTAATCCA CAACCCATCC 7500 

CTGGGCAGCC TGGCATGCCC CAGGGGCAGC CAGGGCTACA GCCACCTACC ATGCCAGGTC 7560 

AGCAGGGGGT CCACTCCAAT CCAGCCATGC AGAACATGAA TCCAATGCAG GCGGGCGTTC 7620 

AGAGGGCTGG CCTGCCCCAG CAGCAACCAC AGCAGCAACT CCAGCCACCC ATGGGAGGGA 7680 

TGAGCCCCCA GGCTCAGCAG ATGAACATGA ACCACAACAC CATGCCTTCA CAATTCCGAG 7740 

ACATCTTGAG ACGACAGCAA ATGATGCAAC AGCAGCAGCA ACAGGGAGCA GGGCCAGGAA 7800 

TAGGCCCTGG AATGGCCAAC CATAACCAGT TCCAGCAACC CCAAGGAGTT GGCTACCCAC 7860 

CACAGCCGCA GCAGCGGATG CAGCATCACA TGCAACAGAT GCAACAAGGA AATATGGGAC 7920 

AGATAGGCCA GCTTCCCCAG GCCTTGGGAG CAGAGGCAGG TGCCAGTCTA CAGGCCTATC 7980 

AGCAGCGACT CGTTCAGCAA CAGATGGGGT CCCCTGTTCA GCCCAACCCC ATGAGCCCCC 8040 

AGCAGCATAT GCTCCCAAAT CAGGCCCAGT CCCCACACCT ACAAGGCCAG CAGATCCCTA 8100 

ATTCTCTCTC CAATCAAGTG CGCTCTCCCC AGCCTGTCCC TTCTCCACGG CCACAGTCCC 8160 

AGCCCCCCCA CTCCAGTCCT TCCCCAAGGA TGCAGCCTCA GCCTTCTCCA CACCACGTTT 8220 

CCCCACAGAC AAGTTCCCCA CATCCTGGAC TGGTAGCTGC CCAGGCCAAC CCCATGGAAC 8280 

AAGGGCATTT TGCCAGCCCG GACCAGAATT CAATGCTTTC TCAGCTTGCT AGC--ATCCAG 8340 

GCATGGCAAA CCTCCATGGT GCAAGCGCCA CGGACCTGGG ACTCAGCACC GATAACTCAG 8400 

ACTTGAATTC AAACCTCTCA CAGAGTACAC TAGACATACA CTAGAGACAC CTTGTATTTT 8460 

GGGAGCAAAA AAATTATTTT CTCTTAACAA GACTTTTTGT ACTGAAAACA ATTTTTTTGA 8520 

ATCTTTCGTA GCCTAAAAGA CAATTTTCCT TGGAACACAT AAGAACTGTG CAGTAGCCGT 8580 

TTGTGGTTTA AAGCAAACAT GCAAGATGAA CCTGAGGGAT GATAGAATAC AAAGAATATA 8640 

TTTTTGTTAT GGGCTGGTTA CCACCAGCCT TTCTTCCCCT TTGTGTGTGT GGTTCAAGTG 8700 

TGCACTGGGA GGAGGCTGAG GCCTGTGAAG CCAAACAATA TGCTCCTGCC TTGCACCTCC 8760 

AATAGGTTTT ATTATTTTTT TTAAATTAAT GAACATATGT AATATTAATG AACATATGTA 8820 

ATATTAATAG TTATTATTTA CTGGTGCAGA TGGTTGACAT TTTTCCCTAT TTTCCTCACT 8880 

TTATGGAAGA GTTAAAACAT TTCTAAACCA GAGGACAAAA GGGGTTAATG TTACTTTGAA 8940 




ATTACATTCT ATATATATAT AAATATATAT AAATATATAT TAAAATACCA GTTTTTTTTC 9000 

TCTGGGTGCA AAGATGTTCA TTCTTTTAAA AAATGTTTAA AAAAAA 9046 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7326 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATGGCCGAGA ACTTGCTGGA CGGACCGCCC AACCCCAAAC GAGCCAAACT CAGCTCGCCC 60 

GGCTTCTCCG CGAATGACAA CACAGATTTT GGATCATTGT TTGACTTGGA AAATGACCTT 120 

CCTGATGAGC TGATCCCCAA TGGAGAATTA AGCCTTTTAA ACAGTGGGAA CCTTGTTCCA 180 

GATGCTGCGT CCAAACATAA ACAACTGTCA GAGCTTCTTA GAGGAGGCAG CGGCTCTAGC 240 

ATCAACCCAG GGATAGGCAA TGTGAGTGCC AGCAGCCCTG TGCAACAGGG CCTTGGTGGC 300 

CAGGCTCAGG GGCAGCCGAA CAGTACAAAC ATGGCCAGCT TAGGTGCCAT GGGCAAGAGC 360 

CCTCTGAACC AAGGAGACTC ATCAACACCC AACCTGCCCA AACAGGCAGC CAGCACCTCT 420 

GGGCCCACTC CCCCTGCCTC CCAAGCACTG AATCCACAAG CACAAAAGCA AGTAGGGCTG 480 

GTGACCAGTA GTCCTGCCAC ATCACAGACT GGACCTGGGA TCTGCATGAA TGCTAACTTC 540 

AACCAGACCC ACCCAGGCCT TCTCAATAGT AACTCTGGCC ATAGCTTAAT GAATCAGGCT 600 

CAACAAGGGC AAGCTCAAGT CATGAATGGA TCTCTTGGGG CTGCTGGAAG AGGAAGGGGA 660 

GCTGGAATGC CCTACCCTGC TCCAGCCATG CAGGGGGCCA CAAGCAGTGT GCTGGCGGAG 720 

ACCTTGACAC AGGTTTCCCC ACAAATGGCT GGCCATGCTG GACTAAATAC AGCACAGGCA 780 

GGAGGCATGA CCAAGATGGG AATGACTGGT ACCACAAGTC CATTTGGACA ACCCTTTAGT 840 

CAAACTGGAG GGCAGCAGAT GGGAGCCACT GGAGTGAACC CCCAGTTAGC CAGCAAACAG 900 

AGCATGGTCA ATAGTTTACC TGCTTTTCCT ACAGATATCA AGAATACTTC AGTCACCACT 960 

GTGCCAAATA TGTCCCAGTT GCAAACATCA GTGGGAATTG TACCCACACA AGCAATTGCA 1020 

ACAGGCCCCA CAGCAGACCC TGAAAAACGC AAACTGATAC AGCAGCAGCT GGTTCTACTG 1080 

CTTCATGCCC ACAAATGTCA GAGACGAGAG CAAGCAAATG GAGAGGTTCG NGCCTGTTCT 1140 

CTCCCACACT GTCGAACCAT GAAAAACGTT TTGAATCACA TGACACATTG TCAGGCTCCC 1200 

AAAGCCTGCC AAGTTGCCCA TTGTGCATCT TCACGACAAA TCATCTCTCA TTGGAAGAAC 1260 

TGCACACGAC ATGACTGTCC TGTTTGCCTC CCTTTGAAAA ATGCCAGTGA CAAGCGAAAC 1320 

CAACAAACCA TCCTGGGATC TCCAGCTAGT GGAATTCAAA ACACAATTGG TTCTGTTGGT 1380 

GCAGGGCAAC AGAATGCCAC TTCCTTAAGT AACCCAAATC CCATAGACCC CAGTTCCATG 1440 

CAGCGGGCCT ATGCTGCTCT AGGACTCCCC TACATGAACC AGCCTCAGAC GCAGCTGCAG 1500 

CCTCAGGTTC CTGGCCAGCA ACCAGCACAG CCTCCAGCCC ACCAGCAGAT GAGGACTCTC 1560 

AATGCCCTAG GAAACAACCC CATGAGTGTC CCAGCAGGAG GAATAACAAC AGATCAACAG 1620 

CCACCAAACT TGATTTCAGA ATCAGCTCTT CCAACTTCCT TGGGGGCTAC CAATCCACTG 1680 

ATGAATGATG GTTCAAACTC TGGTAACATT GGAAGCCTCA GCACGATACC TACAGCAGCG 1740 

CCTCCTTCCA GCACTGGTGT TCGAAAAGGC TGGCATGAAC ATGTGACTCA GGACCTACGG 1800 

AGTCATCTAG TCCATAAACT CGTTCAAGCC ATCTTCCCAA CTCCAGACCC TGCAGCTCTG 1860 

AAAGATCGCC GCATGGAGAA CCTGGTTGCC TATGCTAAGA AAGTGGAGGG AGACATGTAT 1920 

GAGTCTGCTA ATAGCAGGGA TGAATACTAT CATTTATTAG CAGAGAAAAT CTATAAAATA 1980 

CAAAAAGAAC TAGAAGAAAA GCGGAGGACA CGTTTACATA AGCAAGGCAT CCTGGGTAAC 2040 

CAGCCAGCTT TACCAGCTTC TGGGGCTCAG CCCCCTGTGA TTCCACCAGC CCAGTCTGTA 2100 

AGACCTCCAA ATGGGCCCCT GCCTTTGCCA GTGAATCGCA TGCAGGTTTC TCAAGGGATG 2160 

AATTCATTTA ACCCAATGTC CCTGGGAAAC GTCCAGTTGC CACAGGCACC CATGGGACCT 2220 

CGTGCAGCCT CCCCTATGAA CCACTCTGTG CAGATGAACA GCATGGCCTC AGTTCCGGGT 2280 

ATGGCCATTT CTCCTTCACG GATGCCTCAG CCTCCAAATA TGATGGGCAC TCATGCCAAC 2340 

AACATTATGG CCCAGGCACC TACTCAGAAC CAGTTTCTGC CACAGAACCA GTTTCCATCA 2400 

TCCAGTGGGG CAATGAGTGT GAACAGTGTG GGCATGGGGC AACCAGCAGC CCAGGCAGGT 2460 

GTTTCACAGG GTCAGGAACC TGGAGCTGCT CTCCCTAACC CTCTGAACAT GCTGGCACCC 2520 

CAGGCCAGCC AGCTGCCTTG CCCACCAGTG ACACAGTCAC CATTGCACCC GACTCCACCT 2580 

CCTGCTTCCA CAGCTGCTGG CATGCCCTCT CTCCAACATC CAACGGCACC AGGAATGACC 2640 

CCTCCTCAGC CAGCAGCTCC CACTCAGCCA TCTACTCCTG TGTCATCTGG GCAGACTCCT 2700 

ACCCCAACTC CTGGCTCAGT GCCCAGCGCT GCCCAAACAC AGAGTACCCC TACAGTCCAG 2760 

GCAGCAGCAC AGGCTCAGGT GACTCCACAG CCTCAGACCC CAGTGCAGCC ACCATCTGTG 2820 

GCTACTCCTC AGTCATCACA GCAGCAACCA ACGCCTGTGC ATACTCAGCC ACCTGGCACA 2880 

CCGCTTTCTC AGGCAGCAGC CAGCATTGAT AATAGAGTCC CTACTCCCTC CACTGTGACC 2940 




AGTGCTGAAA CCAGTTCCCA GCAGCCAGGA CCCGATGTGC CCATGCTGGA AATGAAGACA 3000 

GAGGTGCAGA CAGATGATGC TGAGCCTGAA CCTACTGAAT CCAAGGGGGA ACCTCGGTCT 3060 

GAGATGATGG AAGAGGATTT ACAAGGTTCT TCCCAAGTAA AAGAAGAGAC AGATACGACA 3120 

GAGCAGAAGT CAGAGCCAAT GGAAGTAGAA GAAAAGAAAC CTGAAGTAAA AGTGGAAGCT 3180 

AAAGAGGAAG AAGAGAACAG TTCGAACGAC ACAGCCTCAC AATCAACATC TCCTTCCCAG 3240 

CCACGCAAAA AAATCTTTAA ACCCGAGGAG CTACGCCAGG CACTTATGCC AACTCTAGAA 3300 

GCACTCTATC GACAGGACCC AGAGTCTTTG CCTTTTCGTC AGCCTGTAGA TCCTCAGCTC 3360 

CTAGGAATCC CAGATTATTT TGATATAGTG AAGAATCCTA TGGACCTTTC TACCATCAAA 3420 

CGAAAGCTGG ACACAGGGCA ATATCAAGAA CCCTGGCAGT ATGTGGATGA TGTCAGGCTT 3480 

ATGTTCAACA ATGCGTGGCT ATATAATCGT AAAACGTCCC GTGTATATAA ATTTTGCAGT 3540 

AAACTTGCAG AGGTCTTTGA ACAAGAAATT GACCCTGTCA TGCAGTCTCT TGGATATTGC 3600 

TGTGGACGAA AGTATGAGTT CTCCCCACAG ACTTTGTGCT GTTACGGAAA GCAGCTGTGT 3660 

ACAATTCCTC GTGATGCAGC CTACTACAGC TATCAGAATA GGTATCATTT CTGTGGGAAG 3720 

TGTTTCACAG AGATCCAGGG CGAGAATGTG ACCCTGGGTG ACGACCCTTC CCAACCTCAG 3780 

ACGACAATTT CCAAGGATCA ATTTGAAAAG AAGAAAAATG ATACCTTAGA TCCTGAACCT 3840 

TTTGTTGACT GCAAAGAGTG TGGCCGGAAG ATGGATCAGA TTTGTGTTCT ACACTATGAC 3900 

ATCATTTGGC CTTCAGGTTT TGTGTGTGAC AACTGTTTGA AGAAAACTGG CAGACCTCGG 3960 

AAAGAAAACA AATTCAGTGC TAAGAGGCTG CAGACCACAC GATTGGGAAA CCACTTAGAA 4020 

GACAGAGTGA ATAAGTTTTT GCGGCGCCAG AATCACCCTG AAGCTGGGGA GGTTTTTGTC 4080 

AGAGTGGTGG CCAGCTCAGA CAAGACTGTG GAGGTCAAGC CGGGAATGAA GTCAAGGTTT 4140 

GTGGATTCTG GAGAGATGTC GGAATCTTTC CCATATCGTA CCAAAGCACT CTTTGCTTTT 4200 

GAGGAGATCG ATGGAGTCGA TGTGTGCTTT TTTGGGATGC ATGTGCAAGA TACGGCTCTG 4260 

ATTGCCCCCC ACCAAATACA AGGCTGTGTA TACATATCTT ATCTGGACAG TATTCATTTC 4320 

TTCCGGCCCC GCTGCCTCCG GACAGCTGTT TACCATGAGA TCCTCATCGG ATATCTCGAG 4380 

TATGTGAAGA AATTGGTGTA TGTGACAGCA CATATTTGGG CCTGTCCCCC AAGTGAAGGA 4440 

GATGACTATA TCTTTCATTG CCACCCCCCT GACCAGAAAA TCCCCAAACC AAAACGACTA 4500 

CAGGAGTGGT ACAAGAAGAT GCTGGACAAG GCGTTTGCAG AGAGGATCAT TAACGACTAT 4560 

AAGGACATCT TCAAACAAGC GAACGAAGAC AGGCTCACGA GTGCCAAGGA GTTGCCCTAT 4620 

TTTGAAGGAG ATTTCTGGCC TAATGTGTTG GAAGAAAGCA TTAAGGAACT AGAACAAGAA 4680 

GAAGAAGAAA GGAAAAAAGA AGAGAGTACT GCAGCGAGTG AGACTCCTGA GGGCAGTCAG 4740 

GGTGACAGCA AAAATGCGAA GAAAAAGAAC AACAAGAAGA CCAACAAAAA CAAAAGCAGC 4800 

ATTAGCCGCG CCAACAAGAA GAAGCCCAGC ATGCCCAATG TTTCCAACGA CCTGTCGCAG 4860 

AAGCTGTATG CCACCATGGA GAAGCACAAG GAGGTATTCT TTGTGATTCA TCTGCATGCT 4920 

GGGCCTGTTA TCAGCACTCA GCCCCCCATC GTGGACCCTG ATCCTCTGCT TAGCTGTGAC 4980 

CTCATGGATG GGCGAGATGC CTTCCTCACC CTGGCCAGAG ACAAGCACTG GGAATTCTCT 5040 

TCCTTACGCC GCTCCAAATG GTCCACTCTG TGCATGCTGG TGGAGCTGCA CACACAGGGC 5100 

CAGGA^GCT TTGTTTATAC CTGCAATGAG TGCAAACACC ATGTGGAAAC ACGCTGGCAC 5160

TGCACTGTGT GTGAGGACTA TGACCTTTGT ATCAATTGCT ACAACACAAA GAGCCACACC 522 0 

CATAAGATGG TGAAGTGGGG GCTAGGCCTA GATGATGAGG GCAGCAGTCA GGGTGAGCCA 52 80 

CAGTCCAAGA GCCCCCAGGA ATCCCGGCGT CTCAGCATCC AGCGCTGCAT CCAGTCCCTG 5 340 

GTGCATGCCT GCCAGTGTCG CAATGCCAAC TGCTCACTGC CGTCTTGCCA GAAGATGAAG 5400 

CGAGTCGTGC AGCACACCAA GGGCTGCAAG CGCAAGACTA ATGGAGGATG CCCAGTGTGC 54 60 

AAGCAGCTCA TTGCTCTTTG CTGCTACCAC GCCAAACACT GCCAAGAAAA TAAATGCCCT 5 52 0 

GTGCCCTTCT GCCTCAACAT CAAACATAAC GTCCGCCAGC AGCAGATCCA GCACTGCCTG 558 0 

CAGCAGGCTC AGCTCATGCG CCGGCGAATG GCAACCATGA ACACCCGCAA TGTGCCTCAG 564 0 

CAGAGTTTGC CTTCTCCTAC CTCAGCACCA CCCGGGACTC CTACACAGCA GCCCAGCACA 57 00 

CCCCAAACAC CACAGCCCCC AGCCCAGCCT CAGCCTTCAC CTGTTAACAT GTCACCAGCA 57 60 

GGCTTCCCTA ATGTAGCCCG GACTCAGCCC CCAACAATAG TGTCTGCTGG GAAGCCTACC 5820 

AACCAGGTGC CAGCTCCCCC ACCCCCTGCC CAGCCCCCAC CTGCAGCAGT AGAAGCAGCC 58 80 

CGGCAAATTG AACGTGAGGC CCAGCAGCAG CAGCACCTAT ACCGAGCAAA CATCAACAAT 59 4 0 

GGCATGCCCC CAGGACGTGA CGGTATGGGG ACCCCAGGAA GCCAAATGAC TCCTGTGGGC 6000 

CTGAATGTGC CCCGTCCCAA CCAAGTCAGT GGGCCTGTCA TGTCTAGTAT GCCACCTGGG 6060 

CAGTGGCAGC AGGCACCCAT CCCTCAGCAG CAGCCGATGC CAGGCATGCC CAGGCCTGTA 612 0 

ATGTCCATGC AGGCCCAGGC AGCAGTGGCT GGGCCACGGA TGCCCAATGT GCAGCCAAAC 6180 

AGGAGCATCT CGCCAAGTGC CCTGCAAGAC CTGCTACGGA CCCT/aAAGTC ACCCAGCTCT 624 0 

CCTCAGCAGC AGCAGCAGGT GCTGAACATC CTTAAATCAA ACCCACAGCT AATGGCAGCT 630 0 

TTCATCAAAC AGCGCACAGC CAAGTATGTG GCCAATCAGC CTGGCATGCA GCCCCAGCCC 6 360 

GGACTTCAAT CCCAGCCTGG TATGCAGCCC CAGCCTGGCA TGCACCAGGA GCCTAGTTTG 642 0 

CAAAACCTGA ACGCAATGCA AGCTGGTGTG CCACGGCCTG GTGTGCCTCC ACCACAACCA 64 8 0 

GCAATGGGAG GCCTGAATCC CCAGGGACAA GCTCTGAACA TCATGAACCC AGGACACAAC 65 4 0 

CCCAACATGA CAAACATGAA TCCACAGTAC CGAGAAATGG TGAGGAGACA GCTGCTACAG 6600 

CACCAGCAGC AGCAGCAGCA ACAGCAGCAG CAGCAGCAGC AACAACAAAA TAGTGCCAGC 6660 

TTGGCCGGGG GCATGGCGGG ACACAGCCAG TTCCAGCAGC CACAAGGACC TGGAGGTTAT 6720

GCCCCAGCCA TGCAGCAGCA ACGCATGCAA CAGCACCTCC CCATCCAGGG CAGCTCCATG 6780

GGCCAGATGG CTGCTCCAAT GGGACAACTT GGCCAGATGG GGCAGCCTGG GCTAGGGGCA 68 4 0 

GACAGCACCC CTAATATCCA GCAGGCCCTG CAGCAACGGA TTCTGCAGCA GCAGCAGATG 6900 

AAGCAACAAA TTGGGTCACC AGGCCAGCCG AACCCCATGA GCCCCCAGCA GCACATGCTC 6960 

TCAGGACAGC CACAGGCCTC ACATCTCCCT GGCCAGCAGA TCGCCACATC CCTTAGTAAC 7020

CAGGTGCGAT CTCCAGCCCC TGTGCAGTCT CCACGGCCCC AATCCCAACC TCCACATTCC 7080

AGCCCGTCAC CACGGATACA ACCCCAGCCT TCACCACACC ATGTTTCACC CCAGACTGGA 714 0 

ACCCCTCACC CTGGACTCGC AGTCACCATG GCCAGCTCCA TGGATCAGGG ACACCTGGGG 7 2 00 

AACCCTGAAC AGAGTGCAAT GCTCCCCCAG CTGAATACCC CCAACAGGAG CGCACTGTCC 72 60 

AGTGAACTGT CCCTGGTTGG TGATACCACG GGAGACACAC TAGAAAAGTT TGTGGAGGGT 7 32 0 

TTGTAG 7326

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2499 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TCACTTGTCA ATTAATCCAG CTTCCTTAAT TTTACTGAAG AAGAATTTCT CCAGGATATT 60 

GGCACATTTG TAGTATTCAC TCTCAGGGGC GTTGTACTCT TTGCAATTGG TAAAGACTCG 120 

CTGTAAGTCT GCCATGAATA ATTTCTTAGA CACGTAGTAC CTATTCTTGA GGCGTTCACT 180 

CATGGTTTTG AGATCCATGG GGGACCTTAT AACTTCATAA TATCCTGGAG CTTCTGTTCT 24 0 

CTTCACAGGT TCCATGAAGG GCCAAGCGCT TTGATGGCTC TTCACCTGCT GGAGGATGCT 300 

CTTGAGCGTG CTGTAAAGCT GGTCAGGGTC TCTGGGCTCT TTACTTTTCT CTTTTCCACT 360 

CGGTTTCCAG CCTGTCTCTC TAATTCCAGG AATGCTTTCT ATAGGAATCT GTCGAACTCC 42 0 

ATCTTTAAAA CATGAAAGTC CAGGGTAAAC TTTTCGAATT TGTGCCTGTT TTCTTTCAAT 480 

CAGTTTTTTA ATTATCTCCT TCTGCTTTTT AATGATGACA GAAAATTCTG TGTACGGGAT 54 0 

CCGTGGATTT AGCTCACATC CCATTAAAGT GGCTCCTTCA TAATCCTTGA TATAGCCAAC 600 

ATATTTGGTT TTAGGTATTT TAATTTCTTT GGAGAAACCC TGTTTCTTAA AGTATCCAAT 660 

TGCATATTCA TCTGCATATG TGAGGAAGTT CAGGATGTCA TGCTTTATGT GATATTCTTT 720 

CAAATGATTC ATCAGGTGTG TTCCATAGCC CTTGACTTGC TCATTTGAGG TTACAGCACA 780 

GAAGACAATC TCTGTGAATC CTTGAGATGG GAACATACGG AAACAGATAC CACCAATAAC 84 0 

ACGGCCATCT TTAATTAAAG CAAGGGTTTT GTGTTTCGGG TCAAAGACGA GCCGTGTGAT 900 

GTATTCTTTT GGCATTCGGG GCAGCTGGTG GGAGAAAACG TTCTGTAGGC CAACCAGCCA 960 

CATCAGGATC TTCTTGTTTG GTTTCTGGTT GAGGGAATTG CCAACCACGT GAAATTCAAT 1020

TACACCCCTG CGCTCTTCCA ACCTTGCCGC CTCATCCCTG GCCGAGTGTG CTGACAGAAA 1080

ATTGGTCTCT GGTCCAAGCA TTGCTGCAGG GTCCGTGATG GTAGACATAA CCTCGTTGAT 114 0 

TAATTCCATC GGAATATCCC CCATAACTCG GGGTTTCTTG GCCTCCTCCA GAACATGAGA 12 00 

ATCAGTCATT TTCCTCTTTT CTCCTGGGTT TGCCTCAAGT CCAGAAGAGG CTTTGCAGGC 12 60 

AGGACTGCTG CTCCCTGCGT TTGGCTGCTC AAGGGAAGAT GAGGTTGAAT TGTATGAAAT 132 0 

TGTCCCAGCC ACAGGAGGTG GATTGATAAC TGTTTGGATG CCTAGCTGGC TGGTTCTGGA 138 0 

AGAGGCTGAG AGAAAATCCT GATCCCAGAT GGGAGAGTTT TGACTATATA CTTCTTCTTC 14 40 

TAGCATGGAC AGAAATTTTG GGAAATGAGT GAGGATTAGA GTTCGTTTTT CAAGAGGCAG 150 0 

TTTATCTTTT TCCTGTCTTG CTTGTTCCAG GAGTTGTCGC CTCATAACAG TGAAGACCGA 1560 

GCGAAGCAAT GTTCTCCCAA ACACCTGTGT GGTTTCGTAC CGAGGTAGAC TGTCGCAGAA 1620

CTGTGGCACG TTGCAGTAAC ACAGCCACCT TGTGTAGTTC TCTTTGTATC CAGAAATATC 168 0 

ATCATTGGGA GATCGCAGTC TTCGTTGAGA TGGTGCCTCC AGATGCCAAT AGTTGATGCG 17 4 0 

GTTTAGGAAC ATTTTTGCCA ACTCAACTAT TGTTTGCCTT TCTTTTGCTG GCAGGTGACT 1800

AAATTTGTAC TGCACAAAGT TATTCACACC CTGTTCAATG CTAGGTTTTT CAAATGGGGG 18 60 

TTTCTTTTCC AAAGAGCCTT CAACCACAGG TTTTCCTCTT TGTAAAATAG ACTTTCTCAA 192 0 

GAGCTTAAAT AGATAGAAAT AAACTTGTTT GGTATCTGCA TCTTCTTCCT TGTGGACACA 198 0 

GGTAAAGAGA TATTCCACAT CCAATACTAT TCCCAGGAGT CTGTTCATTT CTTCCTCTGA 2 04 0 

CACATTCTCC AGGTGGGAAA CATGAGCAGC TAGGGCATGG CTACAACTCC GACAGGATTC 2100 

TGTTAGACTG ACAATTATTT GCTGCAGGTC GGCTCTGGGG GGAGTGGGTG AGGGGTTAGG 2160 

GTTTTTCCAG CCATTACATT TACAAGACTC CTCGGCCTTG CAGGCGGAGT ACACTCCGAG 22 2 0 

TTTCTCCAGT TTCTTGGCCC GCGGAGCGGA GCGTAGTTGC GCTTTCTTCA CGGCGATTCG 22 8 0 

GGCCGAGCCA CCGCCTCCCG GTCCTTCGGC CGTGCCCGCT GCAGCCACTG CCGTCGCCGG 2 34 0 

ACCGCAGGCG CCCGAGCCCC CGGCGGCAGC GGCGCAGGGG GAGCCCTGCG GGGGCGCGGG 2 4 00 
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CGGAAGCGCC GCAGGCTGCG GGGGCAGCGC CCCGGGCCCG GCCCCTGCCC CGGCTCCTGC 2 4 60 
CCCGCAGCCG CCCGGCCCGG CCCCGCCAGC CTCGGACAT 24 9 9 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2442 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

TCACTTGTCA ATCAACCCTG CTTCCTTAAT TTTACTGAAG AAGAACTTCT CCAGGATGCT 60 

GGCGCATTTG TAGTACTCGC TCTCGGGAGG GTTGTACTCC TTGCAGTTGG TGAACACTCG 12 0 

TTGCAAGTCC GCCATGAATA ACTTCTTAGA CACATAGTAC CTGTTCCTGA GGCGTTCACT 180

CATGGTTTTC AGATCCATGG GGAACCTTAT AACTTCATAA TATCCCGGAG CTTCTGTTCT 2 40 

CTTCACTGGT TCCATGAAAG GCCAAGCATT TGGATGGTTC TTCACCTGCT GCAGGATGTT 300 

CTTGAGGGTG CTGTAAACGT GCTCAGGGTC TTTGGGCTCT TTACTTTTCT CTTTTCCACT 3 60 

TGGTTTCCAG CCTGTCTCTC TGATTCCAGG AATGCTTTCT ATAGGAATCT GCCGAACTCC 42 0 

ATCTTTGAAA CACGAAAGTC CAGGGTAGAC TTTTCGAATC TGGGCTTGTT TTCTTTCTAT 4 80 

CAGCTTTTTA ATGATCTCCT TCTGCTTTTT AATGATGACA GAGAACTCTG TGTATGGGAT 540 

CTGAGGGTTC AGCTCACATC CCATCAAAGT GGCCCCTTCA TAATCCTTGA TGTAGCCAAC 600 

ATATTTGGTT TTAGGTATTT TGATTTCTTT GGAGAAACCC TGCTTCTTGA AATAGCCGAT 660 

GGCATACTCA TCTGCATATG TGAGGAAGTT GAGGATCTCG TGCTTTATGT GGTATTCTTT 7 20 

GAGATGGTTC ATCAGGTGGG TTCCATAGCC CTTGACTTGT TCATTTGAGG TTACTGCACA 7 80 

GAAAACAATC TCTGTGAATC CCTGGGATGG AAACATCCGG AAACAGATAC CACCAATGAC 8 40 

ACGGCCATCT TTAATTAAAG CAAGGGTTTT GTGTTTCGGG TCAAAGACGA GCCGTGTGAT 900 

GTACTCTTTG GGCATTCTGG GCAGCTGGTG GGAAAACACA TTCTGGAGGC CCACGAGCCA 960 

CATCAGGATC TTCTTGTTTG GTTTCTGGTT CAGGGAGTTG CCCACCACGT GGAATTCAAT 1020 

GACACCCCTG CGTTCTTCCA GCCGTGCCGC CTCATCTCTG GCCGAATGGG CTGACAGAAA 1080 

ATTGGTCTCT GGTCCAAGCA TCCCTGCAGG GTCTGTGATG GTAGACATGA CCTCATTGAT 1140 

CAATTCCACG GGAATATCCC CCATCACTCG AGATCTCTTG GCCTCCTCGG GAGCATGAGA 12 00 

GTTGTTCATT TTCCTCTTTT CTCCCGGGTT TGCTTCAAGC CCAGAAGAGC CTCTGCATCC 12 60 

AGGACTTGTT CTCCCTCCAT TGATCTGCTC ATGGGAAGTT GAATTTGAAC TGAACAATGC 132 0 

TGTCCCAGTA ACAGGAGGAC TGATTACTGT TTGGATTCCT AGCGGGCTGG TTCTGGAAGA 138 0 

GGCTGAGAGA AAATCCTGAT CCCAGATAGG AGAATTTTGA CTATACACTT CTTCTTCCAA 14 4 0 

CATGGACAGA AACTTTGGGA AATGTGTGAG GATAAGCGTG CGTTTCTCAA GAGGCAGTTT 1500

GTCTTTTTTC TGTCTGGCTT GTTCCAAGAG CTGTCGTCTC ATGATGGTGA AGACCGAGCG 1560 

AAGCAATGTT CTCCCAAACA CCTTTGTGGT TTCGTACCGA GGTAAGCTGT CACAGAACTG 162 0 

CGGTACATTG CAGTAGCACA ACCACCTTGT GTAGTTTTCC TTGTATCCAG AGATGTCATC 168 0 

ATTGGGAGAC CGTAGTCTCC GCTGAGATGG AGCCTCCAGA TGCCAGTAGT TGATGCGGTT 17 4 0 

CAGAAACATC TTGGCCAGCT CGATCGTTGT CTGCCTCTCT TTCGATGGCA AGTGACTAAA 18 0 0 

CTTGTACTGC ACGAAGTTGT TCACACCCTG TTCAATACTG GGCTTCTCAA ATGGCGGCTT 1860 

CTTCTCCAAG GAGCCTTCAA CCACAGGTTT TCCTCTTTGT AAAATTGACT TTCTCAAGAG 192 0 

CTTGAATAGG TAGAAGTACA CTTGTTTGGT ATCTGCATCT TCTTCTTTGT GGACGCAGGT 198 0 

GAAGAGGTAC TCCACATCCA ACACAATTCC CAGGAGTCTG TCCATCTCTT CCTCTGACAC 2 04 0 

ATTCTCCAAG TGAGAAACGT GAGCAGCAAG GGCATGGCTA CAGCTTCGAC AGGATTCTGT 2100

CAAACTGACA ATTATCTGCT GGAGGTCTCC TCTTGGTGGA GTAGGAGAGG GGTTAGGGTT 2160 

CTTCCAGCCA TTGCATTTAC AGGACTCCTC TGCCTTGCAG GCGGAGTACA CGCCGAGTTT 222 0 

CTCCAGCTTC TTCGCCCGCG GAGCAGAGCG CAACTGCGCC TTCTTCACGG CGATCCGGGC 228 0 

CGAGCCGCCT CCTCCCGGTC CCTCGGCGGT GCCCGCCGCG GCCACCGGCG TCGCTGGCCC 234 0 

GCAGGAAGCA GAGCTCCCGG CAGCGGTGGC CAGGGTCCGG GGGGAACCGT GCGGGGGCGC 2400

GGGAGGCAGT GCTGGGGACC CGGCCCCGCC AGCCTCGGCC AT 2442

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

CCCGCCAGCC TCGGACATGC 20

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

CCCGCCAGCC TCGGCCATGC 20

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2442 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

ATGGCCGAGG CTGGCGGGGC CGGGTCCCCA GCACTGCCTC CCGCGCCCCC GCACGGTTCC 60

CCCCGGACCC TGGCCACCGC TGCCGGGAGC TCTGCTTCCT GCGGGCCAGC GACGCCGGTG 120

GCCGCGGCGG GCACCGCCGA GGGACCGGGA GGAGGCGGCT CGGCCCGGAT CGCCGTGAAG 180 
AAGGCGCAGT TGCGCTCTGC TCCGCGGGCG AAGAAGCTGG AGAAACTCGG CGTGTACTCC 2 40 

GCCTGCAAGG CAGAGGAGTC CTGTAAATGC AATGGCTGGA AGAACCCTAA CCCCTCTCCT 3 00 

ACTCCACCAA GAGGAGACCT CCAGCAGATA ATTGTCAGTT TGACAGAATC CTGTCGAAGC 3 60 

TGTAGCCATG CCCTTGCTGC TCACGTTTCT CACTTGGAGA ATGTGTCAGA GGAAGAGATG 42 0 

GACAGACTCC TGGGAATTGT GTTGGATGTG GAGTACCTCT TCACCTGCGT CCACAAAGAA 4 80 

GAAGATGCAG ATACCAAACA AGTGTACTTC TACCTATTCA AGCTCTTGAG AAAGTCAATT 54 0 

TTACAAAGAG GAAAACCTGT GGTTGAAGGC TCCTTGGAGA AGAAGCCGCC ATTTGAGAAG 60 0 

CCCAGTATTG AACAGGGTGT GAACAACTTC GTGCAGTACA AGTTTAGTCA CTTGCCATCG 6 60 

AAAGAGAGGC AGACAACGAT CGAGCTGGCC AAGATGTTTC TGAACCGCAT CAACTACTGG 72 0 

CATCTGGAGG CTCCATCTCA GCGGAGACTA CGGTCTCCCA ATGATGACAT CTCTGGATAC 780 
AAGGAAAACT ACACAAGGTG GTTGTGCTAC TGCAATGTAC CGCAGTTCTG TGACAGCTTA 8 40 

CCTCGGTACG AAACCACAAA GGTGTTTGGG AGAACATTGC TTCGCTCGGT CTTCACCATC 90 0 

ATGAGACGAC AGCTCTTGGA ACAAGCCAGA CAGAAAAAAG ACAAACTGCC TCTTGAGAAA 96 0 

CGCACGCTTA TCCTCACACA TTTCCCAAAG TTTCTGTCCA TGTTGGAAGA AGAAGTGTAT 102 0 

AGTCAAAATT CTCCTATCTG GGATCAGGAT TTTCTCTCAG CCTCTTCCAG AACCAGCCCG 10 8 0 

CTAGGAATCC AAACAGTAAT CAGTCCTCCT GTTACTGGGA CAGCATTGTT CAGTTCAAAT 1140 

TCAACTTCCC ATGAGCAGAT CAATGGAGGG AGAACAAGTC CTGGATGCAG AGGCTCTTCT 12 0 0 

GGGCTTGAAG CAAACCCGGG AGAAAAGAGG AAAATGAACA ACTCTCATGC TCCCGAGGAG 12 6 0 

GCCAAGAGAT CTCGAGTGAT GGGGGATATT CCCGTGGAAT TGATCAATGA GGTCATGTCT 132 0 

ACCATCACAG ACCCTGCAGG GATGCTTGGA CCAGAGACCA ATTTTCTGTC AGCCCATTCG 13 8 0 

GCCAGAGATG AGGCGGCACG GCTGGAAGAA CGCAGGGGTG TCATTGAATT CCACGTGGTG 14 4 0 

GGCAACTCCC TGAACCAGAA ACCAAACAAG AAGATCCTGA TGTGGCTCGT GGGCCTCCAG 15 00 

AATGTGTTTT CCCACCAGCT GCCCAGAATG CCCAAAGAGT ACATCACACG GCTCGTCTTT 15 60 

GACCCGAAAC ACAAAACCCT TGCTTTAATT AAAGATGGCC GTGTCATTGG TGGTATCTGT 162 0 

TTCCGGATGT TTCCATCCCA GGGATTCACA GAGATTGTTT TCTGTGCAGT AACCTCAAAT 168 0 

GAACAAGTCA AGGGCTATGG AACCCACCTG ATGAACCATC TCAAAGAATA CCACATAAAG 1740

CACGAGATCC TCAACTTCCT CACATATGCA GATGAGTATG CCATCGGCTA TTTCAAGAAG 18 00 

CAGGGTTTCT CCAAAGAAAT CAAAATACCT AAAACCAAAT ATGTTGGCTA CATCAAGGAT 18 60 

TATGAAGGGG CCACTTTGAT GGGATGTGAG CTGAACCCTC AGATCCCATA CACAGAGTTC 192 0 

TCTGTCATCA TTAAAAAGCA GAAGGAGATC ATTAAAAAGC TGATAGAAAG AAAACAAGCC 198 0 

CAGATTCGAA AAGTCTACCC TGGACTTTCG TGTTTCAAAG ATGGAGTTCG GCAGATTCCT 2040

ATAGAAAGCA TTCCTGGAAT CAGAGAGACA GGCTGGAAAC CAAGTGGAAA AGAGAAAAGT 2100

AAAGAGCCCA AAGACCCTGA GCACGTTTAC AGCACCCTCA AGAACATCCT GCAGCAGGTG 2160

AAGAACCATC CAAATGCTTG GCCTTTCATG GAACCAGTGA AGAGAACAGA AGCTCCGGGA 2220

TATTATGAAG TTATAAGGTT CCCCATGGAT CTGAAAACCA TGAGTGAACG CCTCAGGAAC 2280

AGGTACTATG TGTCTAAGAA GTTATTCATG GCGGACTTGC AACGAGTGTT CACCAACTGC 2340

AAGGAGTACA ACCCTCCCGA GAGCGAGTAC TACAAATGCG CCAGCATCCT GGAGAAGTTC 2400

TTCTTCAGTA AAATTAAGGA AGCAGGGTTG ATTGACAAGT GA 2442
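
As a cross-check on the listing, the first 20 nucleotides of SEQ ID NO: 18 and the last 20 nucleotides of SEQ ID NO: 15, as printed, appear to be reverse complements of each other. The Python sketch below shows one way to run that check. The function and constant names are illustrative only and do not come from the specification; only the two 20-nucleotide strings are copied from the listing.

```python
# Illustrative check only; names are hypothetical, not from the specification.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string.

    Whitespace (the 10-base grouping used in the listing) is ignored.
    Only the unambiguous bases A, C, G and T are handled.
    """
    cleaned = "".join(seq.split()).upper()
    return cleaned.translate(COMPLEMENT)[::-1]


# First 20 nt of SEQ ID NO: 18 and last 20 nt of SEQ ID NO: 15, as printed.
SEQ18_START = "ATGGCCGAGG CTGGCGGGGC"
SEQ15_END = "GCCCCGCCAG CCTCGGCCAT"


def is_reverse_complement(a: str, b: str) -> bool:
    """True when b is the reverse complement of a (grouping spaces ignored)."""
    return reverse_complement(a) == "".join(b.split()).upper()


if __name__ == "__main__":
    print(is_reverse_complement(SEQ18_START, SEQ15_END))
```

The same helper can be applied to longer stretches of the two sequences to locate transcription errors in either one.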

What is claimed is:

1. A purified protein designated P/CAF having a molecular weight of about 93,000 
daltons as determined by sodium dodecyl sulfate polyacrylamide gel electrophoresis 
under reducing conditions and which acetylates histones. 

2. The protein of claim 1 consisting of the amino acid sequence of SEQ ID NO: 1.

3. The protein of claim 1 comprising the amino acid sequence of SEQ ID NO: 2.

4. The protein of claim 1, which also binds to the amino acid sequence of SEQ ID
NO:3 on a p300 cellular protein and to amino acid residues 1805-1854 of a CBP cellular
protein (SEQ ID NO:9).

5. A fragment of the protein of claim 1 having histone acetyltransferase activity.

6. A polypeptide consisting of the amino acid sequence of SEQ ID NO: 2.

7. A fragment of the protein of claim 1 which binds to the amino acid sequence of
SEQ ID NO: 3 on the p300 cellular protein and the amino acid sequence of SEQ ID
NO:9 on the CBP cellular protein.

8. A polypeptide consisting of the amino acid sequence of SEQ ID NO: 4.

9. A nucleic acid consisting of the nucleotide sequence of SEQ ID NO: 10.

10. A nucleic acid having a nucleotide sequence which encodes the protein of claim 1.

11. A nucleic acid having a nucleotide sequence which encodes the protein of claim 2.

12. A nucleic acid having a nucleotide sequence which encodes the protein of claim 3.

13. A nucleic acid consisting of the nucleotide sequence which encodes the protein
of claim 4.

14. A nucleic acid complementary to and which selectively hybridizes with the
nucleic acid of claim 11 under stringent hybridization conditions.

15. A fragment of the nucleic acid of claim 9, which encodes a polypeptide that 
acetylates histones. 

16. A fragment of the nucleic acid of claim 9, which encodes a polypeptide which 
binds to the amino acid sequence of SEQ ID NO: 3 on the p300 cellular protein and the 
amino acid sequence of SEQ ID NO:9 on the CBP cellular protein. 

17. A purified antibody which specifically binds the protein of claim 1.

18. A purified antibody which specifically binds the protein of claim 2.

19. A purified antibody which specifically binds the protein of claim 3.

20. A purified antibody which specifically binds the protein of claim 4.

21. An assay for screening substances for the ability to inhibit or stimulate the
histone acetyltransferase activity of P/CAF comprising:

a) contacting the substance with a system in which histone acetylation by
P/CAF can be determined;

b) determining the amount of histone acetylation by P/CAF in the
presence of the substance; and

c) comparing the amount of histone acetylation by P/CAF in the
presence of the substance with the amount of histone acetylation by P/CAF in the
absence of the substance, a decreased or increased amount of histone acetylation by
P/CAF in the presence of the substance indicating a substance that can inhibit or
stimulate, respectively, the histone acetyltransferase activity of P/CAF.

22. An assay for screening substances for the ability to inhibit binding of P/CAF to 
p300/CBP comprising: 

a) contacting the substance with a system in which the P/CAF binding of
p300/CBP can be determined;

b) determining the amount of P/CAF binding of p300/CBP in the presence of 
the substance; and 

c) comparing the amount of binding of P/CAF to p300/CBP in the presence of 
the substance with the amount of binding of P/CAF to p300/CBP in the absence of the 
substance, a decreased amount of binding of P/CAF to p300/CBP in the presence of the 
substance indicating a substance that can inhibit the ability to inhibit binding of P/CAF to 
p300/CBP. 

23. The method of claim 22, wherein the system consists of a cell free reaction
mixture comprising a fragment of the p300 protein comprising amino acid residues
1767-1816 (SEQ ID NO: 3) and the protein of claim 4.

24. The method of claim 22, wherein the system consists of a cell free reaction 
mixture comprising a fragment of the CBP protein comprising amino acid residues 
1805-1854 (SEQ ID NO: 9) and the protein of claim 4. 

25. The method of claim 22, wherein the system consists of a cell extract produced
from cells producing both p300 and P/CAF.

26. An assay for screening substances for the ability to inhibit or stimulate the
histone acetyltransferase activity of p300/CBP comprising:

a) contacting the substance with a system in which histone acetylation by
p300/CBP can be determined;

b) determining the amount of histone acetylation by p300/CBP in the
presence of the substance; and

c) comparing the amount of histone acetylation by p300/CBP in the
presence of the substance with the amount of histone acetylation by p300/CBP in the
absence of the substance, a decreased or increased amount of histone acetylation by
p300/CBP in the presence of the substance indicating a substance that can inhibit or
stimulate, respectively, the histone acetyltransferase activity of p300/CBP.

27. An assay for screening substances for the ability to inhibit binding of a DNA- 
binding transcription factor to p300/CBP comprising: 

a) contacting the substance with a system in which the DNA-binding
transcription factor binding of p300/CBP can be determined;

b) determining the amount of DNA-binding transcription factor binding of
p300/CBP in the presence of the substance; and

c) comparing the amount of binding of DNA-binding transcription factor to
p300/CBP in the presence of the substance with the amount of binding of DNA-binding
transcription factor to p300/CBP in the absence of the substance, a decreased amount of
binding of DNA-binding transcription factor to p300/CBP in the presence of the
substance indicating a substance that can inhibit the ability to inhibit binding of DNA-
binding transcription factor to p300/CBP.

28. The method of claim 27, wherein the system consists of a cell free reaction
mixture comprising a DNA-binding transcription factor and p300/CBP.

29. The method of claim 27, wherein the system consists of a cell free reaction
mixture comprising a fragment of the CBP protein comprising a DNA-binding
transcription factor and p300/CBP.

30. The method of claim 27, wherein the system consists of a cell extract produced
from cells producing both a DNA-binding transcription factor and p300/CBP.

31. The method of claim 27, wherein the DNA-binding transcription factor is
selected from the group consisting of a nuclear hormone receptor, CREB, c-Jun/v-Jun,
c-Myb/v-Myb, YY1, Sap-1a, c-Fos, MyoD and SRC-1.

32. A method for inhibiting the transcription modulating activity of P/CAF in a 
subject, comprising administering to the subject a transcription modulating activity 
inhibiting amount of a substance in a pharmaceutically acceptable carrier. 

33. The method of claim 32, wherein the substance can inhibit the transcription 
modulating activity of P/CAF by preventing the binding of P/CAF to p300/CBP. 

34. A method for stimulating the transcription modulating activity of P/CAF in a
subject, comprising administering to the subject a transcription modulating activity
stimulating amount of a substance in a pharmaceutically acceptable carrier.

35. The method of claim 34, wherein the substance can stimulate the transcription
modulating activity of P/CAF by promoting the binding of P/CAF to p300/CBP.

36. The method of claim 34, wherein the substance can stimulate the transcription
modulating activity of P/CAF by stimulating the histone acetyltransferase activity of
P/CAF.

37. A method for inhibiting the histone acetyltransferase activity of p300/CBP in a
subject, comprising administering to the subject a histone acetyltransferase activity
inhibiting amount of a substance in a pharmaceutically acceptable carrier.

38. The method of claim 37, wherein the substance can inhibit the transcription
modulating activity of p300/CBP by preventing the binding of a DNA-binding
transcription factor to p300/CBP.

39. The method of claim 38, wherein the DNA-binding transcription factor is
selected from the group consisting of a nuclear hormone receptor, CREB, c-Jun/v-Jun,
c-Myb/v-Myb, YY1, Sap-1a, c-Fos, MyoD and SRC-1.

40. The method of claim 37, wherein the substance is an antibody which binds 
p300/CBP. 

41. A method for stimulating the histone acetyltransferase activity of p300/CBP in a
subject, comprising administering to the subject a histone acetyltransferase activity 
stimulating amount of a substance in a pharmaceutically acceptable carrier. 

42. The method of claim 41, wherein the substance can stimulate the histone 
acetyltransferase activity of p300/CBP by promoting the binding of a DNA-binding 
transcription factor to p300/CBP.

Fig. 1

P300/CBP-ASSOCIATED TRANSCRIPTIONAL CO-FACTOR P/CAF AND USES THEREOF

BACKGROUND OF THE INVENTION

Field of the Invention 

The present invention provides a transcriptional co-factor, p300/CBP-associated
factor (P/CAF), which modulates transcription through binding to the cellular
transcription co-factors p300 and CBP and through acetylation of histones. Also
provided are methods for screening for the presence of P/CAF and for substances which
alter the transcription modulating effect and growth regulatory activity of P/CAF.

Background Art 

Cellular proteins p300 and CBP are global transcriptional coactivators that are
involved in the regulation of various DNA-binding transcriptional factors (Janknecht and
Hunter, 1996). Recently, p300 was found to be very closely related to CBP, a factor
that binds selectively to the protein kinase A-phosphorylated form of CREB (3-5).
Cellular factors p300 and CBP exhibit strong amino acid sequence similarity and share
the capacity to bind both CREB and E1A (6-8). Although neither p300 nor CBP by
itself binds to DNA, each can be recruited to promoter elements via interaction with
sequence-specific activators and functions as a transcriptional adaptor. For
simplicity, p300 and CBP will be termed p300/CBP in the context of discussing their
shared functional properties.

p300/CBP is a large protein consisting of over 2,400 amino acids, known to
interact with a variety of DNA-binding transcriptional factors including nuclear hormone
receptors (13,57), CREB (3,4,7), c-Jun/v-Jun (9,11), YY1 (10), c-Myb/v-Myb (12,58),
Sap-1a (59), c-Fos (11) and MyoD (60). DNA-binding factors recruit p300/CBP not
only by direct but also by indirect interactions through cofactors. For example, nuclear
hormone receptors recruit p300/CBP directly as well as through indirect interactions via
SRC-1, which stimulates transcription by binding to various nuclear hormone receptors
(13,61).

The transforming proteins encoded by adenovirus and several other small DNA
tumor viruses disturb host cell growth control by interacting with cellular factors that
normally function to repress cell proliferation. One of the most intensively studied of
these viral proteins, the product of the adenovirus E1A gene, is itself sufficient for
transformation (1). E1A transforming activity resides in two distinct domains, the
targets of which include p300/CBP and products of the retinoblastoma (RB)
susceptibility gene family (1,2). Interactions of E1A with p300/CBP and RB are
thought to influence functionally distinct growth regulatory pathways, allowing the two
domains to contribute additively to transformation (1).

The paradigm for how E1A and functionally related viral proteins perturb cell
growth regulation derives in large part from studies on their interactions with RB (1,2).
The molecular function of E1A is based on its capacity to interfere with cellular
protein-protein interactions. Since both E1A and various cellular targets bind to a site in RB
termed the pocket domain (2), E1A can competitively disrupt the complex formation
between RB and its cellular targets.

The second cellular factor implicated in E1A-dependent transformation, p300, is
believed to inhibit G0/G1 exit, to activate certain enhancers, and to stimulate
differentiation (1,2). E1A inhibits the p300/CBP-mediated transcriptional activation of
many promoters (14). In one case that has been examined, the complex of p300 and
YY1, E1A inhibits transcription without disrupting the complex (10).

The present invention provides a cellular protein designated P/CAF which binds
to p300/CBP and plays an important role in both transcription and cell cycle regulation
associated with a histone acetyltransferase activity. The present invention also provides
a histone acetyltransferase activity in the p300/CBP cellular protein, thus providing
targets for modulating transcription and cell cycle regulation in cells.

SUMMARY OF THE INVENTION

The present invention provides a purified protein designated P/CAF having a
molecular weight of about 93,000 daltons as determined by sodium dodecyl sulfate
polyacrylamide gel electrophoresis under reducing conditions and which acetylates
histones and which also binds to the p300/CBP cellular protein.

The present invention further provides a nucleic acid encoding the P/CAF
protein as well as a vector containing the nucleic acid and a host for the vector. A
purified antibody which specifically binds the P/CAF protein is also provided.



In addition, provided is a bioassay for screening substances for the ability to
inhibit the transcription modulating activity of P/CAF and/or histone acetyltransferase
activity, comprising contacting the substance with a system in which histone acetylation
by P/CAF can be determined; determining the amount of histone acetylation by P/CAF
in the presence of the substance; and comparing the amount of histone acetylation by
P/CAF in the presence of the substance with the amount of histone acetylation by
P/CAF in the absence of the substance, a decreased amount of histone acetylation by
P/CAF in the presence of the substance indicating a substance that can inhibit the
transcription modulating activity and/or histone acetyltransferase activity of P/CAF.

Furthermore, the present invention provides a bioassay for screening substances
for the ability to inhibit the transcription modulating activity and/or histone
acetyltransferase activity of P/CAF comprising contacting the substance with a system in
which the p300 binding of P/CAF can be determined; determining the amount of p300
binding of P/CAF in the presence of the substance; and comparing the amount of p300
binding of P/CAF in the presence of the substance with the amount of p300 binding of
P/CAF in the absence of the substance, a decreased amount of p300 binding of P/CAF in
the presence of the substance indicating a substance that can inhibit the transcription
modulating activity and/or histone acetyltransferase activity of P/CAF.

Also provided is a method for determining the amount of P/CAF in a biological
sample comprising contacting the biological sample with a polypeptide comprising the
amino acid sequence of SEQ ID NO:3 under conditions whereby a P/CAF/p300
complex can be formed, and determining the amount of the P/CAF/p300 complex, the
amount of the complex indicating the amount of P/CAF in the sample.

The present invention additionally provides a method for determining the amount
of P/CAF in a biological sample comprising contacting the biological sample with an
antibody which specifically binds P/CAF under conditions whereby a P/CAF/antibody
complex can be formed; and determining the amount of the P/CAF/antibody complex,
the amount of the complex indicating the amount of P/CAF in the sample.

Also provided herein is an assay for screening substances for the ability to inhibit
or stimulate the histone acetyltransferase activity of P/CAF, comprising: contacting the
substance with a system in which histone acetylation by P/CAF can be determined;
determining the amount of histone acetylation by P/CAF in the presence of the
substance; and comparing the amount of histone acetylation by P/CAF in the presence of
the substance with the amount of histone acetylation by P/CAF in the absence of the
substance, a decreased or increased amount of histone acetylation by P/CAF in the
presence of the substance indicating a substance that can inhibit or stimulate,
respectively, the histone acetyltransferase activity of P/CAF.
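
The comparison step of this assay can be sketched in Python as follows. This is a minimal illustration, not a procedure taken from the specification: the function name `classify_substance` and the `tolerance` parameter are hypothetical, and the text does not state how large a change must be before a substance is scored as an inhibitor or stimulator.

```python
def classify_substance(acetylation_with: float,
                       acetylation_without: float,
                       tolerance: float = 0.0) -> str:
    """Compare histone acetylation measured with and without a test substance.

    acetylation_with    -- acetylation measured in the presence of the substance
    acetylation_without -- acetylation measured in its absence (the control)
    tolerance           -- hypothetical fractional change treated as "no effect";
                           an assumption of this sketch, not a value from the text
    """
    if acetylation_without <= 0:
        raise ValueError("control acetylation must be positive")
    # Fractional change relative to the no-substance control.
    change = (acetylation_with - acetylation_without) / acetylation_without
    if change < -tolerance:
        return "inhibitor"    # decreased acetylation
    if change > tolerance:
        return "stimulator"   # increased acetylation
    return "no effect"


if __name__ == "__main__":
    # Arbitrary example readings in the same units (for example, counts per minute).
    print(classify_substance(40.0, 100.0))
    print(classify_substance(150.0, 100.0))
```

Both measurements must be in the same units and taken under otherwise identical conditions for the comparison to be meaningful.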

The present invention further provides an assay for screening substances for the
ability to inhibit binding of P/CAF to p300/CBP comprising: contacting the substance
with a system in which the P/CAF binding of p300/CBP can be determined; determining
the amount of P/CAF binding of p300/CBP in the presence of the substance; and
comparing the amount of binding of P/CAF to p300/CBP in the presence of the
substance with the amount of binding of P/CAF to p300/CBP in the absence of the
substance, a decreased amount of binding of P/CAF to p300/CBP in the presence of the
substance indicating a substance that can inhibit the ability to inhibit binding of P/CAF to
p300/CBP.
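
One way to express the binding comparison numerically is as a percent reduction relative to the no-substance control. The Python sketch below is illustrative only: the function names are hypothetical, and the specification does not prescribe this or any other formula.

```python
def percent_binding_inhibition(bound_with: float, bound_without: float) -> float:
    """Percent reduction in P/CAF binding to p300/CBP caused by a test substance.

    bound_with    -- amount of binding measured in the presence of the substance
    bound_without -- amount of binding measured in its absence (the control)
    A positive result means less binding with the substance than without it.
    """
    if bound_without <= 0:
        raise ValueError("control binding must be positive")
    return 100.0 * (bound_without - bound_with) / bound_without


def inhibits_binding(bound_with: float, bound_without: float) -> bool:
    """True when binding is lower in the presence of the substance."""
    return percent_binding_inhibition(bound_with, bound_without) > 0.0


if __name__ == "__main__":
    # Arbitrary example readings in the same units.
    print(percent_binding_inhibition(25.0, 100.0))
```

In practice a real screen would also need replicate measurements and a significance criterion, which this sketch leaves out.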


WO 98/03652    PCT/US97/12877

In addition, an assay is provided for screening substances for the ability to inhibit
or stimulate the histone acetyltransferase activity of p300/CBP, comprising: contacting
the substance with a system in which histone acetylation by p300/CBP can be
determined; determining the amount of histone acetylation by p300/CBP in the presence
of the substance; and comparing the amount of histone acetylation by p300/CBP in the
presence of the substance with the amount of histone acetylation by p300/CBP in the
absence of the substance, a decreased or increased amount of histone acetylation by
p300/CBP in the presence of the substance indicating a substance that can inhibit or
stimulate, respectively, the histone acetyltransferase activity of p300/CBP.

Furthermore, the present invention provides an assay for screening substances
for the ability to inhibit binding of a DNA-binding transcription factor to p300/CBP,
comprising: contacting the substance with a system in which the binding of the
DNA-binding transcription factor to p300/CBP can be determined; determining the amount
of binding of the DNA-binding transcription factor to p300/CBP in the presence of the
substance; and comparing the amount of binding of the DNA-binding transcription factor
to p300/CBP in the presence of the substance with the amount of binding of the
DNA-binding transcription factor to p300/CBP in the absence of the substance, a
decreased amount of binding of the DNA-binding transcription factor to p300/CBP in the
presence of the substance indicating a substance that can inhibit binding of the
DNA-binding transcription factor to p300/CBP.

A method is also provided for inhibiting the transcription modulating activity of
P/CAF in a subject, comprising administering to the subject a transcription modulating
activity inhibiting amount of a substance in a pharmaceutically acceptable carrier.

Also provided in the present invention is a method for stimulating the
transcription modulating activity of P/CAF in a subject, comprising administering to the
subject a transcription modulating activity stimulating amount of a substance in a
pharmaceutically acceptable carrier.

Furthermore, the present invention provides a method for inhibiting the histone
acetyltransferase activity of p300/CBP in a subject, comprising administering to the
subject a histone acetyltransferase activity inhibiting amount of a substance in a
pharmaceutically acceptable carrier.

Finally, the present invention additionally provides a method for stimulating the
histone acetyltransferase activity of p300/CBP in a subject, comprising administering to
the subject a histone acetyltransferase activity stimulating amount of a substance in a
pharmaceutically acceptable carrier.

BRIEF DESCRIPTION OF THE FIGURES 

Figs. 1A-B. Fig. 1A: P/CAF-p300/CBP interaction in vivo. Cell extract was
immunoprecipitated with rabbit anti-P/CAF (lanes 1, 4, and 7), rabbit anti-CBP (lanes 2
and 5), and mouse anti-p300 (lane 9) antibodies. For controls, cell extract was
precipitated with rabbit control IgG (lanes 3, 6, and 8) or mouse anti-HA monoclonal
antibody (lane 10). The precipitates were analyzed by immunoblotting with anti-P/CAF
(lanes 1-3), anti-CBP (lanes 4-6), and anti-p300 (lanes 7-10) antibodies. The positions
of non-specific bands are indicated by asterisks. Fig. 1B: E1A inhibits the P/CAF-p300
interaction in vivo. Osteosarcoma cells were transfected with either control vector
(lanes 1 and 4) or E1A- (lanes 2 and 5) or E1AΔN- (lanes 3 and 6) expression vectors.
Extract from the transfected subpopulation was immunoprecipitated with anti-P/CAF
(lanes 1-3) or control (lanes 4-6) IgG. The precipitates were analyzed by
immunoblotting with anti-p300 and anti-P/CAF.

Figs. 2A-F. P/CAF and E1A mediate antagonistic effects on cell cycle
progression. HeLa cells (ATCC accession number CCL 2) were transfected by
electroporation with 7 µg of P/CAF-expression plasmid and/or 3 µg of the full-length or
the N-terminally deleted (Δ2-36) E1A 12S-expression plasmid as indicated in the figure.
These plasmids were constructed by subcloning FLAG-P/CAF and E1A cDNAs into
pCX (34) and pcDNAI (Invitrogen), respectively. All samples, in addition, contained 1 µg
of sorting plasmid (pCMV-IL2R) (31) and carrier plasmid (pCX) to normalize the
total amount of DNA to 11 µg. After transfection, cells were incubated in Dulbecco's
modified Eagle's medium with 10% fetal bovine calf serum for 12 hours and
subsequently labeled in medium containing 10 µM bromo-deoxyuridine (BrdU) for 30
min. Subsequently, the transfected subpopulation was purified by magnetic affinity cell
sorting and nuclei were analyzed by dual parameter flow cytometry as described (32).
Histograms show percentages of cells in G1 and S phases. Abscissa values represent
fluorescence intensity of bound anti-BrdU antibodies in log scale.

Fig. 3. Histone acetyltransferase activity of P/CAF. Activity of hGCN5 (lanes 1
and 4) and P/CAF (lanes 2 and 5) that acetylates free histones (lanes 1-3) or histones in
the nucleosome core particle (35) (lanes 4-6) was measured as described (36). Each
reaction contains 0.3 pmol of affinity purified FLAG-hGCN5 or FLAG-P/CAF, 4 pmol
of the histone octamer or the nucleosome core particle and 10 pmol of [1-14C]acetyl-CoA.
Note that the histone octamer dissociates into dimers or tetramers under assay
conditions. Acetylated histones were detected by autoradiography after separation by
SDS-PAGE. The bands corresponding to acetylated histones H3 and H4 are indicated
by arrows.

DETAILED DESCRIPTION OF THE INVENTION 

As used in the specification and in the claims, "a" can mean one or more, 
depending upon the context in which it is used. 


P/CAF protein and fragments 

The present invention provides a purified protein designated P/CAF having a
molecular weight of about 93,000 daltons as determined by sodium dodecyl sulfate
polyacrylamide gel electrophoresis under reducing conditions and which acetylates
histones. The P/CAF protein can also bind to the amino acid region of SEQ ID NO:3
(amino acid (aa) residues 1753 - 1966) of the cellular transcriptional factor, p300 (which
has the complete amino acid sequence of SEQ ID NO:6 and the nucleotide sequence of
SEQ ID NO:12), and the amino acid region of SEQ ID NO:9 (amino acid residues 1805
- 1854) of the cellular transcriptional factor, CBP (which has the complete amino acid
sequence of SEQ ID NO:7 and the nucleotide sequence of SEQ ID NO:13). The
P/CAF protein can be defined by any one or more of the typically used parameters.
Examples of these parameters include, but are not limited to, molecular weight
(calculated or empirically determined), isoelectric focusing point, specific epitope(s),
complete amino acid sequence, sequence of a specific region (e.g., N-terminus) of the
amino acid sequence and the like.

For example, the P/CAF protein can consist of the amino acid sequence of SEQ
ID NO:1 or the P/CAF protein can comprise the amino acid sequence of SEQ ID NO:2,
which represents the carboxy terminal end of the P/CAF protein and contains the histone
acetyltransferase activity, or the amino acid sequence of SEQ ID NO:4, which
represents the amino terminal end of the P/CAF protein, containing the binding site for
p300/CBP. Because the amino-terminal region is specific for P/CAF, it can be used to
define and identify P/CAF.

As used herein, "purified" refers to a protein (polypeptide, peptide, etc.) that is
sufficiently free of contaminants or cell components with which it normally occurs to
distinguish it from the contaminants or other components of its natural environment.
The purified protein need not be homogeneous, but must be sufficiently free of
contaminants to be useful in a clinical or research setting, for example, in an assay for
detecting antibodies to the protein. Greater levels of purity can be obtained using
methods derived from well known protocols. Specific methods for purifying P/CAF
proteins are known in the art.






As will be appreciated by those skilled in the art, the invention also includes
those P/CAF polypeptides having slight variations in amino acid sequence which yield
polypeptides equivalent to the P/CAF protein defined herein. Such variations may arise
naturally as allelic variations (e.g., due to genetic polymorphism) or may be produced by
human intervention (e.g., by mutagenesis of cloned DNA sequences), such as induced
point, deletion, insertion and substitution mutants. Minor changes in amino acid
sequence are generally preferred, such as conservative amino acid replacements, small
internal deletions or insertions, and additions or deletions at the ends of the molecules.
Substitutions may be designed based on, for example, the model of Dayhoff, et al. (37).
These modifications can result in changes in the amino acid sequence, provide silent
mutations, modify a restriction site, or provide other specific mutations.

Modifications to any of the P/CAF proteins or fragments can be made, while
preserving the specificity and activity (function) of the native protein or fragment
thereof. As used herein, "native" describes a protein that occurs in nature. The
modifications contemplated herein can be conservative amino acid substitutions, for
example, the substitution of a basic amino acid for a different basic amino acid.
Modifications can also include creation of fusion proteins with epitope tags or known
recombinant proteins or genes encoding them created by subcloning into commercial or
non-commercial vectors (e.g., polyhistidine tags, flag tags, myc tag,
glutathione-S-transferase [GST] fusion protein, xylE fusion reporter construct).
Furthermore, the modifications can be such as do not affect the function of the protein
or the way the protein accomplishes that function (e.g., its secondary structure or the
ultimate result of the protein's activity). These products are equivalent to the P/CAF
protein. The means for determining the function, way and result parameters are well
known.

Having provided an example of a purified P/CAF protein, the invention also
enables the purification of P/CAF homologs from other species and allelic variants from
individuals within a species. For example, an antibody raised against the exemplary
human P/CAF protein can be used routinely to screen preparations from different
humans for allelic variants of the P/CAF protein that react with the
P/CAF protein-specific antibody. Similarly, an antibody raised against an epitope, for
example, from a conserved amino acid region of the human P/CAF protein can be used to
routinely screen for homologs of the P/CAF protein in other species. A P/CAF protein
can be routinely identified in and obtained from other species and from individuals
within a species using the methods taught herein and others known in the art. For
example, given the present sequence, the DNA encoding a conserved amino acid sequence
can be used to probe genomic DNA or DNA libraries of an organism to predictably obtain
the P/CAF gene for that organism. The gene can then be cloned and expressed as the
P/CAF protein and purified according to any of a number of routine, predictable
methods. An example of the routine protein purification methods available in the art can
be found in Pei et al. (38).

A purified polypeptide fragment of the P/CAF protein is also provided. The
term "fragment" as used herein regarding a P/CAF protein, means a molecule of at least
five contiguous amino acids of P/CAF protein that has at least one function shared by
P/CAF protein or a region thereof. These functions can include antigenicity, binding
capacity, acetyltransferase activity and structural roles, among others. The P/CAF
fragment can be specific for a recited source. As used herein to describe an amino acid
sequence (protein, polypeptide, peptide, etc.), "specific" means that the amino acid
sequence is not found identically in any other source. The determination of specificity is
made routine by the availability of computerized amino acid sequence databases and
sequence comparison programs, wherein an amino acid sequence of almost any length
can be quickly and reliably checked for the existence of identical sequences. If an
identical sequence is not found, the protein is "specific" for the recited source. For
example, a P/CAF fragment can be species-specific (e.g., found in the P/CAF protein of
humans, but not of other species).
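
As a minimal sketch of the identity check described above, assuming the other
sources are available as plain one-letter amino acid strings, the determination
can be expressed in Python as follows. The function names, the five-residue
minimum taken from the definition of "fragment", and the toy sequences are
illustrative assumptions; the toy sequences are not P/CAF sequences and no
particular database or comparison program is implied.

```python
MIN_FRAGMENT_LENGTH = 5  # "at least five contiguous amino acids"


def sources_containing(fragment, other_sources):
    """Return the names of the sources that contain `fragment` identically.

    other_sources: dict mapping a source name to its amino acid sequence.
    """
    fragment = fragment.upper()
    if len(fragment) < MIN_FRAGMENT_LENGTH:
        raise ValueError("fragment must be at least five residues long")
    return sorted(name for name, seq in other_sources.items()
                  if fragment in seq.upper())


def is_specific(fragment, other_sources):
    """True if no identical copy of `fragment` occurs in any other source."""
    return not sources_containing(fragment, other_sources)


# Toy, made-up sequences standing in for entries of a sequence database.
toy_database = {
    "species_a": "MKTAYIAKQR",
    "species_b": "GSHMLEDPAA",
}
print(is_specific("TAYIA", toy_database))   # found in species_a
print(is_specific("WWWWW", toy_database))   # not found in any entry
```

An exact substring search is used here because the definition requires that the
sequence not be found identically; similarity searching is a separate question.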

A fragment of the P/CAF protein having histone acetyltransferase activity can
consist of the amino acid sequence of SEQ ID NO:2. A fragment of the P/CAF protein
which binds to the amino acid sequence of SEQ ID NO:3 on p300 and the amino acid
sequence of SEQ ID NO:9 on CBP can consist of the amino acid sequence of SEQ ID
NO:4. To the extent that these fragments are specific for P/CAF, they can be used to
identify and define P/CAF.

An antigenic fragment of P/CAF protein is provided. An antigenic fragment has
an amino acid sequence of at least about five consecutive amino acids of a P/CAF
protein amino acid sequence and binds an antibody or elicits an immune response in an
animal. An antigenic fragment can be selected by applying the routine technique of
epitope mapping to P/CAF protein to determine the regions of the proteins that contain
epitopes reactive with antibodies or are capable of eliciting an immune response in an
animal. Once the epitope is selected, an antigenic polypeptide containing the epitope
can be synthesized directly, or produced recombinantly by cloning nucleic acids
encoding the antigenic polypeptide in an expression system, according to standard
methods.

Alternatively, an antigenic fragment of the antigen can be isolated from the
whole P/CAF protein or a larger fragment of the P/CAF protein by chemical or
mechanical disruption. Fragments can also be randomly chosen from a known P/CAF
protein sequence and synthesized. The purified fragments thus obtained can be tested to
determine their antigenicity and specificity by routine methods.

Nucleic Acids Encoding P/CAF Protein 

An isolated nucleic acid that encodes a P/CAF protein is also provided. As used
herein, the term "isolated" means a nucleic acid separated or substantially free from at
least some of the other components of the naturally occurring organism, for example,
the cell structural components commonly found associated with nucleic acids in a
cellular environment and/or other nucleic acids. The isolation of nucleic acids can
therefore be accomplished by techniques such as cell lysis followed by phenol plus
chloroform extraction, followed by ethanol precipitation of the nucleic acids (39). It is
not contemplated that the isolated nucleic acids are necessarily totally free of all
non-nucleic acid components or all other nucleic acids, but that the isolated nucleic
acids are isolated to a degree of purification to be useful in clinical, diagnostic,
experimental, or other procedures such as, for example, gel electrophoresis, Southern,
Northern or dot blot hybridization, or polymerase chain reaction (PCR).

A skilled artisan in the field will readily appreciate that there are a multitude of
procedures which may be used to isolate the nucleic acids prior to their use in other
procedures. These include, but are not limited to, lysis of the cell followed by gel
filtration or anion exchange chromatography, binding DNA to silica in the form of glass
beads, filters or diatoms in the presence of high concentrations of chaotropic salts, or
ethanol precipitation of the nucleic acids.

The nucleic acids of the present invention can include positive and negative
strand RNA as well as DNA and can include genomic and subgenomic nucleic acids
found in the naturally occurring organism. The nucleic acids contemplated by the
present invention include double stranded and single stranded DNA of the genome,
complementary positive stranded cRNA and mRNA, and complementary cDNA
produced therefrom and any nucleic acid which can selectively or specifically hybridize
to the isolated nucleic acids provided herein. Stringent conditions (further described
below) are used to distinguish selectively or specifically hybridizing nucleic acids from
non-selectively and non-specifically hybridizing nucleic acids.

An isolated nucleic acid that encodes a P/CAF protein can be species-specific
(i.e., does not encode the P/CAF protein of other species and does not occur in other
species). Examples of the nucleic acids contemplated herein include the nucleic acid of
SEQ ID NO:10 as well as the nucleic acids that encode each of the P/CAF proteins or
fragments thereof described herein. P/CAF proteins and protein fragments can be
routinely obtained as described herein and their structure (sequence) determined by
routine means including the methods as used herein.


P/CAF protein-encoding nucleic acids can be isolated from an organism in which
they are normally found (e.g., humans), using any of the routine techniques. For
example, a genomic DNA or cDNA library can be constructed and screened for the
presence of the nucleic acid of interest using one of the present P/CAF protein-encoding
nucleic acids as a probe. Methods of constructing and screening such libraries are well
known in the art and kits for performing the construction and screening steps are
commercially available (for example, Stratagene Cloning Systems, La Jolla, CA). Once
isolated, the nucleic acid can be directly cloned into an appropriate vector, or if
necessary, be modified to facilitate the subsequent cloning steps. Such modification
steps are routine, an example of which is the addition of oligonucleotide linkers, which
contain restriction sites, to the termini of the nucleic acid (see, for example, ref. 39).

P/CAF protein-encoding nucleic acids can also be synthesized. For example, a
method of obtaining a DNA molecule encoding a specific P/CAF protein is to synthesize
a recombinant DNA molecule which encodes the P/CAF protein. For example, nucleic
acid synthesis procedures are routine in the art and oligonucleotides coding for a
particular protein region are readily obtainable through automated DNA synthesis. A
nucleic acid for one strand of a double-stranded molecule can be synthesized and
hybridized to its complementary strand. One can design these oligonucleotides such that
the resulting double-stranded molecule has either internal restriction sites or appropriate
5' or 3' overhangs at the termini for cloning into an appropriate vector.
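
The design of a complementary oligonucleotide pair with 5' overhangs can be
sketched, under stated assumptions, as follows. The function names, the six-base
insert and the overhang sequences are hypothetical and chosen only for
illustration; AATT and GATC are the 5' overhangs left by EcoRI and BamHI
digestion, respectively, but the specification does not name any particular
enzyme for this purpose.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G and T")
    return seq.translate(_COMPLEMENT)[::-1]


def design_duplex(insert, left_overhang="", right_overhang=""):
    """Return (top, bottom) oligonucleotides, both written 5' to 3'.

    After annealing, `insert` pairs with its reverse complement, while
    `left_overhang` protrudes at the 5' end of the top strand and
    `right_overhang` protrudes at the 5' end of the bottom strand.
    """
    insert = insert.upper()
    top = left_overhang.upper() + insert
    bottom = right_overhang.upper() + reverse_complement(insert)
    return top, bottom


top, bottom = design_duplex("ATGGCC", left_overhang="AATT", right_overhang="GATC")
print(top)
print(bottom)
```

Both strands are reported 5' to 3', which is the orientation in which
oligonucleotides are ordinarily specified for synthesis.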

Oligonucleotides complementary to or identical with the P/CAF protein-encoding
nucleic acid sequence can be synthesized as primers for amplification
reactions, such as PCR, or as probes to detect P/CAF protein encoding nucleic acids by
various hybridization protocols (e.g., Northern blot, Southern blot, dot blot, colony
screening, etc.).

Double-stranded molecules coding for relatively large proteins can readily be
synthesized by first constructing several different double-stranded molecules that code
for particular regions of the protein, followed by ligating these DNA molecules together.
For example, Cunningham, et al. (40), have constructed a synthetic gene encoding the
human growth hormone by first constructing overlapping and complementary synthetic
oligonucleotides and ligating these fragments together. See also, Ferretti, et al. (41),
wherein synthesis of a 1057 base pair synthetic bovine rhodopsin gene from synthetic
oligonucleotides is disclosed.

By constructing a P/CAF protein-encoding nucleic acid in this manner, one
skilled in the art can readily obtain any particular P/CAF protein with modifications at
any particular position or positions. See also, U.S. Patent No. 5,503,995 which
describes an enzyme template reaction method of making synthetic genes. Techniques
such as this are routine in the art and are well documented. DNA encoding the P/CAF
protein or P/CAF protein fragments can then be expressed in vivo or in vitro.

The nucleic acid encoding the P/CAF protein can be any nucleic acid that
functionally encodes the P/CAF protein. To functionally encode the protein (i.e., allow
the nucleic acid to be expressed), the nucleic acid can include, but is not limited to,
expression control sequences, such as an origin of replication, a promoter, regions
upstream or downstream of the promoter, such as enhancers that may regulate the
transcriptional activity of the promoter, appropriate restriction sites to facilitate cloning
of inserts adjacent to the promoter, antibiotic resistance genes or other markers which
can serve to select for cells containing the vector or the vector containing the insert, and
necessary information processing sites, such as ribosome binding sites, RNA splice sites,
polyadenylation sites and transcription termination sequences as well as any other
sequence which may facilitate the expression of the inserted nucleic acid.

Preferred expression control sequences are promoters derived from
metallothionein genes, actin genes, immunoglobulin genes, CMV, SV40, adenovirus,
bovine papilloma virus, etc. A nucleic acid encoding a P/CAF protein can readily be
determined based upon the genetic code for the amino acid sequence of the P/CAF
protein and many nucleic acid sequences will encode a P/CAF protein. Modifications in
the nucleic acid sequence encoding the P/CAF protein are also contemplated.
Modifications that can be useful are modifications to the sequences controlling
expression of the P/CAF protein to make production of P/CAF protein inducible or
repressible as controlled by the appropriate inducer or repressor. Such means are
standard in the art (see, e.g., ref. 39). The nucleic acids can be generated by means
standard in the art, such as by recombinant nucleic acid techniques, as exemplified in the
examples herein, and by synthetic nucleic acid synthesis or in vitro enzymatic synthesis.
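
The statement above that many nucleic acid sequences will encode a given protein
follows from the degeneracy of the genetic code. The following sketch, offered
only as an illustration, counts the distinct coding sequences for a short peptide
using the number of codons assigned to each amino acid in the standard genetic
code. The function name and the example peptides are assumptions; they are not
P/CAF sequences, and stop codons are not counted.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encoding_sequences(peptide):
    """Return how many distinct nucleotide sequences encode `peptide`.

    `peptide` is a string of one-letter amino acid codes; the count is the
    product of the codon multiplicities of its residues.
    """
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide.upper())


print(count_encoding_sequences("MW"))   # each residue has a single codon
print(count_encoding_sequences("MKR"))  # 1 * 2 * 6 alternatives
```

Because the count is a product over residues, it grows rapidly with the length of
the protein, which is why no single nucleotide sequence is required to encode it.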

After a nucleic acid encoding a particular P/CAF protein of interest, or a region
of that nucleic acid, is constructed, modified, or isolated, that nucleic acid can then be
cloned into an appropriate vector, which can direct the in vivo or in vitro synthesis of
that wild-type and/or modified P/CAF protein. The vector is contemplated to have the
necessary functional elements that direct and regulate transcription of the inserted
nucleic acid, as described above. The vector containing the P/CAF nucleic acid or
nucleic acid fragment can be in a host (e.g., cell or transgenic animal) for expressing the
nucleic acid. The P/CAF protein or fragment thereof can thus be produced in a host
system containing the expression vector and its functional activity as described herein
can be demonstrated according to methods well known in the art.

There are numerous E. coli (Escherichia coli) expression vectors known to one
of ordinary skill in the art useful for the expression of proteins. Other microbial hosts
suitable for use include bacilli, such as Bacillus subtilis, and other enterobacteria, such
as Salmonella, Serratia, as well as various Pseudomonas species. These prokaryotic
hosts can support expression vectors which will typically contain expression control
sequences compatible with the host cell (e.g., an origin of replication). In addition, any
number of a variety of well-known promoters will be present, such as the lactose
promoter system, a tryptophan (Trp) promoter system, a beta-lactamase promoter
system, or a promoter system from phage lambda. The promoters will typically control
expression, optionally with an operator sequence and have ribosome binding site
sequences, for example, for initiating and completing transcription and translation. If
necessary, an amino terminal methionine can be provided by insertion of a Met codon 5'
and in-frame with the gene sequence. Also, the carboxy-terminal extension of the
protein can be removed using standard oligonucleotide mutagenesis procedures.

Additionally, yeast expression can be used. There are several advantages to
yeast expression systems. First, evidence exists that proteins produced in yeast secretion
systems exhibit correct disulfide pairing. Second, post-translational glycosylation is
efficiently carried out by yeast secretory systems. The Saccharomyces cerevisiae
pre-pro-alpha-factor leader region (encoded by the MFα-1 gene) is routinely used to direct
protein secretion from yeast (42). The leader region of pre-pro-alpha-factor contains a
signal peptide and a pro-segment which includes a recognition sequence for a yeast
protease encoded by the KEX2 gene. This enzyme cleaves the precursor protein on the
carboxyl side of a Lys-Arg dipeptide cleavage-signal sequence. The polypeptide coding
sequence can be fused in-frame to the pre-pro-alpha-factor leader region. This construct
is then put under the control of a strong transcription promoter, such as the alcohol
dehydrogenase I promoter or a glycolytic promoter. The protein coding sequence is
followed by a translation termination codon which is followed by transcription
termination signals. Alternatively, the polypeptide encoding sequence of interest can be
fused to a second protein coding sequence, such as Sj26 or β-galactosidase, used to
facilitate purification of the resultant fusion protein by affinity chromatography. The
insertion of protease cleavage sites to separate the components of the fusion protein is
applicable to constructs used for expression in yeast.

Efficient post-translational glycosylation and expression of recombinant proteins
can also be achieved in Baculovirus expression systems in insect cells.

Mammalian cells permit the expression of proteins in an environment that favors
important post-translational modifications such as folding and cysteine pairing, addition
of complex carbohydrate structures and secretion of active protein. Vectors useful for
the expression of proteins in mammalian cells are characterized by insertion of the
protein encoding sequence between a strong viral promoter and a polyadenylation
signal. The vectors can contain genes conferring either gentamicin or methotrexate
resistance for use as selectable markers. For example, the antigen and immunoreactive
fragment coding sequence can be introduced into a Chinese hamster ovary (CHO) cell
line using a methotrexate resistance-encoding vector. Presence of the vector RNA in
transformed cells can be confirmed by Northern blot analysis and production of a cDNA
or opposite strand RNA corresponding to the protein encoding sequence can be
confirmed by Southern and Northern blot analysis, respectively. A number of other
suitable host cell lines capable of secreting intact proteins have been developed in the art
and include the CHO cell lines, HeLa cells, myeloma cell lines, Jurkat cells, and the like.
Expression vectors for these cells can include expression control sequences, as described
above. The vectors containing the nucleic acid sequences of interest can be transferred
into the host cell by well-known methods, which vary depending on the type of cell host.
For example, calcium chloride transfection is commonly utilized for prokaryotic cells,
whereas calcium phosphate treatment or electroporation may be used for other cell
hosts.

Alternative vectors for the expression of protein in mammalian cells, similar to
those developed for the expression of human gamma-interferon, tissue plasminogen
activator, clotting Factor VIII, hepatitis B virus surface antigen, protease Nexin 1, and
eosinophil major basic protein, can be employed. Further, the vector can include CMV
promoter sequences and a polyadenylation signal available for expression of inserted
nucleic acid in mammalian cells (such as COS7).

The nucleic acid sequences can be expressed in hosts after the sequences have
been positioned to ensure the functioning of an expression control sequence. These
expression vectors are typically replicable in the host organisms either as episomes or as
an integral part of the host chromosomal DNA. Commonly, expression vectors can
contain selection markers, e.g., tetracycline resistance or hygromycin resistance, to
permit detection and/or selection of those cells transformed with the desired nucleic acid
sequences (see, e.g., U.S. Patent 4,704,362).

The nucleic acids produced as described above can also be expressed in a host
which is a non-human animal to create a transgenic animal, containing, in a germ or
somatic cell, a nucleic acid comprising the coding sequence for all or a portion of the
P/CAF protein, as well as all of the other regulatory elements required for expression of
the P/CAF protein-encoding sequence. The animal will express the P/CAF gene or
portion thereof to produce the P/CAF protein or protein fragment and such expression
can be detected by determination of a particular phenotype unique to the transgenic
animal expressing the transferred nucleic acid.

The nucleic acid can be the nucleic acid of SEQ ID NO:10, a nucleic acid having
a nucleotide sequence which encodes the P/CAF protein, a nucleic acid having a
nucleotide sequence which encodes the protein of SEQ ID NO:1, as well as the nucleic
acids that encode the proteins comprising the fragments of SEQ ID NOS:2 and 4.

The nucleic acids of the invention can contain substitutions or deletions which
provide a particular phenotype of interest. For example, various deletions or base
substitutions can be introduced into the nucleic acid encoding the P/CAF protein for the
purpose of studying the effects of these particular deletions or substitutions on the
transcription modulation activity of the P/CAF protein. These effects can be monitored
by observation of such characteristics as growth and development of the animal, the
ability to develop tumors, survival rates and the like. The gene construct introduced
into the animal cells to produce the transgenic animal can contain any of the regulatory
elements described above to modulate expression of the foreign genes. As used herein,
the term "phenotype" includes morphology, biochemical profiles, changes in tumor
formation and other parameters that are affected by the presence of the P/CAF protein.

The transgenic animals of the invention can also be used in a method for
determining the effectiveness of administering a nucleic acid encoding a functional
P/CAF protein to a subject in need of a functional P/CAF protein. First, a nucleic acid
encoding a nonfunctional P/CAF protein can be introduced into the animal's cells and
expressed to yield a characteristic phenotype. Then, using standard gene therapy
techniques, a nucleic acid encoding a functional P/CAF protein can be introduced into
the animal's cells and the effects on the animal's phenotypic characteristics can be
determined.

Having provided and taught how to obtain a nucleic acid that encodes a P/CAF
protein, an isolated nucleic acid that encodes a fragment of P/CAF protein is also
provided. The nucleic acid encoding the fragment can be obtained using any of the
methods applicable to the nucleic acid encoding the entire P/CAF protein. The nucleic
acid fragment can encode a species-specific P/CAF protein fragment (e.g., found in the
P/CAF protein of humans, but not in the P/CAF proteins of other species). Nucleic
acids encoding species-specific fragments of P/CAF protein are themselves
species-specific or allele-specific fragments of the P/CAF gene.

Examples of fragments of a nucleic acid encoding a fragment of the P/CAF
protein can include the nucleic acid sequences which encode the amino acid sequences
of the fragments of SEQ ID NOS:2 or 4. The same routine computer analyses used to
select these examples of fragments can be routinely used to obtain others. Fragments of
P/CAF-encoding nucleic acids can be primers for PCR or probes, which can be
species-specific, gene-specific or allele-specific. P/CAF-encoding nucleic acid fragments can
encode antigenic or immunogenic fragments of P/CAF protein that can be used in
therapeutic assays or screening protocols. P/CAF gene fragments can encode fragments
of P/CAF protein having histone acetylase activity and/or p300/CBP binding activity as
described above, as well as other uses that may become apparent.

An isolated nucleic acid of at least ten nucleotides that selectively hybridizes with 
the nucleic acid of SEQ ID NO: 10 under selected conditions is provided. For example, 
the conditions can be PCR amplification conditions and the hybridizing nucleic acid can 
be a primer consisting of a specific fragment of the reference sequence or a nearly 
identical nucleic acid that hybridizes only to the exemplified P/CAF-encoding nucleic
acid or allelic variants thereof.

The invention provides an isolated nucleic acid that selectively hybridizes with 
the P/CAF-encoding nucleic acid sequence of SEQ ID NO: 10 under stringent
conditions. The hybridizing nucleic acid can be a probe that hybridizes only to the
exemplified P/CAF-encoding nucleic acid sequence. Thus, the hybridizing nucleic acid
can be a naturally occurring species-specific allelic variant of the exemplified P/CAF
gene. The hybridizing nucleic acid can also include insubstantial base substitutions that
do not prevent hybridization under the stated stringent conditions or affect either the
function of the encoded protein, the way the protein accomplishes that function (e.g., its
secondary structure) or the ultimate result of the protein's activity. The means for
determining these parameters are well known.

As used herein to describe nucleic acids, the term "selectively hybridizes" 
excludes the occasional randomly hybridizing nucleic acids as well as nucleic acids that
encode other known homologs of the P/CAF protein. The selectively hybridizing
nucleic acids of the invention can have at least 70%, 73%, 78%, 80%, 85%, 88%, 90%,
91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% complementarity with the
segment and strand of the sequence to which it hybridizes. This list is not intended to
exclude percent complementarity values between these values. The nucleic acids can be
at least 10, 15, 16, 17, 18, 20, 21, 23, 24, 25, 30, 35, 40, 50, 100, 150, 200, 300, 500,
550, 750, 900, 950, or 1000 nucleotides in length or any intervening length, depending
on whether the nucleic acid is to be used as a primer, probe or for protein expression.
The hybridizing nucleic acid can comprise a region of at least ten nucleotides (up to full
length) that is completely complementary to a unique region of the nucleic acid to which
it hybridizes.
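
As an illustration of the percent complementarity figures listed above, the following minimal sketch counts the positions at which a probe base-pairs with an equal-length segment of a target strand. The function name `percent_complementarity` and the example sequences are hypothetical and are not taken from the sequence listing; the sketch assumes ungapped, antiparallel Watson-Crick pairing between two DNA strands written 5' to 3'.

```python
# Watson-Crick pairing partners for DNA bases.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(probe: str, target_segment: str) -> float:
    """Percent of probe positions that base-pair with the target segment.

    Both sequences are written 5'->3' and must have the same length.
    The probe is compared with the target read in the antiparallel
    direction, as in a hybridized duplex.
    """
    probe = probe.upper()
    target_segment = target_segment.upper()
    if not probe:
        raise ValueError("sequences must not be empty")
    if len(probe) != len(target_segment):
        raise ValueError("probe and target segment must be the same length")
    antiparallel = target_segment[::-1]
    matches = sum(1 for p, t in zip(probe, antiparallel) if PAIR.get(p) == t)
    return 100.0 * matches / len(probe)


# Hypothetical 10-nucleotide target segment and two candidate probes.
target = "AAGGCCTTAG"
print(percent_complementarity("CTAAGGCCTT", target))  # fully complementary probe
print(percent_complementarity("CTAAGGCCTA", target))  # probe with one mismatch
```

A probe scored this way can be compared against the 70% lower bound recited above; whether a given probe actually hybridizes selectively still has to be established under the stringent conditions described in this section.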

The nucleic acid can be an alternative coding sequence for the P/CAF protein, or 
can be used as a probe or primer for detecting the presence of or obtaining the P/CAF 
protein. If used as primers, the invention provides compositions including at least two
nucleic acids which selectively hybridize with different regions of the nucleic acid so as
to amplify a desired region. Depending on the length of the probe or primer, it can
range between 70% complementary bases and full complementarity and still hybridize
under stringent conditions.

For example, for the purpose of obtaining or determining the presence of a 
nucleic acid encoding the P/CAF protein, the degree of complementarity between the
hybridizing nucleic acid (probe or primer) and the sequence to which it hybridizes
(P/CAF DNA in a sample) should be at least enough to exclude hybridization with a
nucleic acid from another species. The invention provides examples of these nucleic
acids of P/CAF, so that the degree of complementarity required to distinguish selectively
hybridizing from nonselectively hybridizing nucleic acids under stringent conditions can
be clearly determined for each nucleic acid. It should also be clear that the hybridizing
nucleic acids of the invention will not hybridize with nucleic acids encoding unrelated
proteins (hybridization is selective) under stringent conditions.

"Stringent conditions" refers to the washing conditions used in a hybridization 
protocol. In general, the washing conditions should be a combination of temperature
and salt concentration chosen so that the denaturation temperature is approximately
5-20 °C below the calculated Tm of the nucleic acid hybrid under study. The temperature
and salt conditions are readily determined empirically in preliminary experiments in
which samples of reference DNA immobilized on filters are hybridized to the probe or
protein encoding nucleic acid of interest and then washed under conditions of different
stringencies. For example, the nucleic acid sequence of SEQ ID NO: 10 was used as a
specific radiolabeled probe for the detection of messenger RNA transcribed from the
P/CAF gene by performing hybridizations under stringent conditions. The Tm of such an
oligonucleotide can be estimated by allowing 2 °C for each A or T nucleotide, and 4 °C
for each G or C. For example, an 18 nucleotide probe of 50% G+C would, therefore,
have an approximate Tm of 54 °C.
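
The estimate described above (2 °C for each A or T, 4 °C for each G or C) can be checked with a short calculation. The sketch below is an illustration only: the function names `estimate_tm_celsius` and `wash_temperature_range` and the example sequence are hypothetical, and the rule is a rough approximation for short oligonucleotides rather than a substitute for the empirical determination described in this paragraph.

```python
def estimate_tm_celsius(oligo: str) -> int:
    """Estimate Tm as 2 degrees C per A/T plus 4 degrees C per G/C."""
    oligo = oligo.upper()
    at_count = oligo.count("A") + oligo.count("T")
    gc_count = oligo.count("G") + oligo.count("C")
    if at_count + gc_count != len(oligo) or not oligo:
        raise ValueError("oligo must be a non-empty string of A, C, G and T")
    return 2 * at_count + 4 * gc_count


def wash_temperature_range(tm_celsius: int) -> tuple[int, int]:
    """Return the (low, high) window lying 20 to 5 degrees C below the Tm."""
    return tm_celsius - 20, tm_celsius - 5


# Hypothetical 18-nucleotide probe with 9 A/T and 9 G/C bases (50% G+C).
probe = "ACGTACGTACGTACGTAC"
tm = estimate_tm_celsius(probe)
print(tm)                          # 54, matching the worked example in the text
print(wash_temperature_range(tm))  # (34, 49)
```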

The invention provides an isolated nucleic acid that selectively hybridizes with
the P/CAF gene shown in the sequence set forth as SEQ ID NO: 10 under stringent
conditions. The invention further provides an isolated nucleic acid complementary to 
the nucleotide sequence set forth in SEQ ID NO: 10. 

Antibodies to the P/CAF protein

A purified antibody and an antiserum containing polyclonal antibodies that 
specifically bind the P/CAF protein or antigenic fragment are also provided. The term
"bind" means the well understood antigen/antibody binding as well as other nonrandom
association with an antigen. "Specifically bind" as used herein describes an antibody or
other ligand that does not cross react substantially with any antigen other than the one
specified, in this case, an antigen of the P/CAF protein. Antibodies can be made as
described in Harlow and Lane (33). Briefly, purified P/CAF protein or an antigenic
fragment thereof can be injected into an animal in an amount and in intervals sufficient to
elicit a humoral immune response. Serum polyclonal antibodies can be purified directly,
or spleen cells from the animal can be fused with an immortal cell line and screened for
monoclonal antibody secretion, according to procedures well known in the art. Purified
monospecific polyclonal antibodies that specifically bind the P/CAF antigen are also 
within the scope of the present invention. The antibodies of the present invention can 
bind the protein of claim 1, the protein of claim 2, the protein of claim 3 and/or the 
protein of claim 4, as well as any other proteins of the present invention. 

A ligand that specifically binds the antigen is also contemplated. The ligand can 
be a fragment of an antibody, such as, for example, an Fab fragment which retains
P/CAF binding activity, or a smaller molecule designed to bind an epitope of the P/CAF
antigen. The antibody or ligand can be bound to a substrate or labeled with a detectable
moiety or both bound and labeled. The detectable moieties contemplated within the
compositions of the present invention include those listed above in the description of the
diagnostic methods, including fluorescent, enzymatic and radioactive markers. 

The antibody can be bound to a solid support substrate or conjugated with a 
detectable moiety or therapeutic compound or both bound and conjugated. Such
conjugation techniques are well known in the art. For example, conjugation of
fluorescent, radioactive or enzymatic moieties can be performed as described in the art
(33,43). The detectable moieties contemplated in the present invention can include
fluorescent, radioactive and enzymatic markers and the like. Therapeutic drugs
contemplated with the present invention can include cytotoxic moieties such as ricin A
chain, diphtheria toxin, Pseudomonas exotoxin and other chemotherapeutic compounds.

It is well understood by one of skill in the art that all of the above discussion 
regarding antibodies to P/CAF can also be applied with regard to production, 
characterization and use of antibodies which bind the p300/CBP protein or any of the
DNA-binding transcription factors of this invention.

Measuring the P/CAF protein in a sample

The present invention also provides a method for determining the presence and 
thus the amount of P/CAF protein in a biological sample. As used herein, a biological 
sample includes any tissue or cell which would contain the P/CAF protein. Examples of 
cells include tissues taken from surgical biopsies, isolated from a body fluid or prepared
in an in vitro tissue culture environment. 

One example of determining the amount of P/CAF in a biological sample can 
comprise contacting the biological sample with a polypeptide comprising the amino acid
sequence of SEQ ID NO:3 under conditions whereby a P/CAF/p300 complex can be
formed; and determining the amount of the P/CAF/p300 complex, the amount of the
complex indicating the amount of P/CAF in the sample. Determination of the amount
of P/CAF/p300 complex can be accomplished through techniques standard in the art.
For example, the complex may be precipitated out of a solution and detected by the
addition of a detectable moiety conjugated to the p300 protein or by the detection of an
antibody which binds p300 or the P/CAF protein, as taught in the Examples herein.
Antibodies which bind p300 or the P/CAF protein can be either monoclonal or
polyclonal antibodies and can be obtained as described herein. Detection of
P/CAF/p300 complexes by the detection of the binding of antibodies reactive with p300
or the P/CAF protein can be accomplished using various immunoassays as are available
in the art, as described below. 

Alternatively, determination of the amount of P/CAF in a biological sample can 
comprise contacting the biological sample with a polypeptide comprising the amino acid
sequence of SEQ ID NO: 9 under conditions whereby a P/CAF/CBP complex can be
formed; and determining the amount of the P/CAF/CBP complex, the amount of the
complex indicating the amount of P/CAF in the sample. Determination of the amount
of P/CAF/CBP complex can be accomplished through techniques standard in the art.
For example, the complex may be precipitated out of a solution and detected by the
addition of a detectable moiety conjugated to the CBP protein or by the detection of an
antibody which binds either CBP or the P/CAF protein, as taught in the Examples
herein. Antibodies which bind CBP or the P/CAF protein can be either monoclonal or
polyclonal antibodies and can be obtained as described herein. Detection of P/CAF/CBP
complexes by the detection of the binding of antibodies reactive with CBP or the P/CAF
protein can be accomplished using various immunoassays as are available in the art, as
described below.

Another example of determining the amount of P/CAF in a biological sample 
comprises contacting the biological sample with an antibody which specifically binds 
P/CAF under conditions whereby a P/CAF/antibody complex can be formed and
determining the amount of the P/CAF/antibody complex, the amount of the complex
indicating the amount of P/CAF in the sample. Antibodies which bind P/CAF can be 
either monoclonal or polyclonal antibodies and can be obtained as described herein. 
Determination of P/CAF/antibody complexes can be accomplished using various 
immunoassays as are available in the art, as described below. 

Immunoassays such as immunofluorescence assays, radioimmunoassays (RIA),
immunoblotting and enzyme linked immunosorbent assays (ELISA) can be readily
adapted for detection and measurement of P/CAF in a biological sample. Both
polyclonal and monoclonal antibodies can be used in the assays. Available
immunoassays are well known in the art and are extensively described in the patent
and scientific literature. See, for example, U.S. Patent Nos. 3,791,932; 3,839,153;
3,850,752; 3,850,578; 3,853,987; 3,867,517; 3,879,262; 3,901,654; 3,935,074;
3,984,533; 3,996,345; 4,034,074; and 4,098,876.

Screening assays for P/CAF

The present invention also provides a bioassay for screening substances for the 
ability to inhibit the histone acetyltransferase activity of P/CAF comprising contacting a
system, in which histone acetylation by P/CAF can be determined, with the substance
under conditions whereby histone acetylation by P/CAF can occur; determining the
amount of histone acetylation by P/CAF in the presence of the substance; and comparing
the amount of histone acetylation by P/CAF in the presence of the substance with the
amount of histone acetylation by P/CAF in the absence of the substance, a decreased
amount of histone acetylation by P/CAF in the presence of the substance indicating a
substance that can inhibit the histone acetyltransferase activity of P/CAF. The
acetylation of histones by P/CAF can be determined in a system including, for example,
either core histones (histones H2A, H2B, H3 and H4) or the nucleosome core particles
(146 base pairs of DNA wrapped around the octamer of core histones) as substrates, the
P/CAF protein and radiolabeled acetyl-CoA (e.g., [1-14C]acetyl CoA). The presence of
acetylated histones can be detected by autoradiography after separation by SDS-PAGE
as described herein in the Examples. Thus, the compound to be tested for the ability to
inhibit the histone acetyltransferase activity of P/CAF can be added to this system and
assayed for inhibiting ability.
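
The comparison step of this assay (acetylation in the presence of the substance versus in its absence) can be expressed numerically once the acetylated-histone signal has been quantified. The sketch below is a hypothetical illustration: the function names, the example signal values and the 20% cutoff are assumptions made for the example and are not specified anywhere in this document.

```python
def percent_inhibition(signal_without: float, signal_with: float) -> float:
    """Percent decrease in acetylation signal caused by the test substance.

    signal_without: quantified signal in the absence of the substance.
    signal_with:    quantified signal in the presence of the substance.
    A negative result means the signal increased.
    """
    if signal_without <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * (signal_without - signal_with) / signal_without


def is_candidate_inhibitor(signal_without: float, signal_with: float,
                           min_percent: float = 20.0) -> bool:
    """Flag a substance whose inhibition meets an assumed cutoff.

    The 20% default is an arbitrary illustrative threshold.
    """
    return percent_inhibition(signal_without, signal_with) >= min_percent


# Hypothetical band intensities in arbitrary units.
print(percent_inhibition(1000.0, 250.0))      # 75.0
print(is_candidate_inhibitor(1000.0, 250.0))  # True
print(is_candidate_inhibitor(1000.0, 950.0))  # False
```

In practice the cutoff would be chosen from the variability of replicate control reactions rather than fixed in advance.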

The present invention also provides a bioassay for screening substances for the 
ability to inhibit the transcription modulating activity of P/CAF, comprising contacting a
system, in which histone acetylation by P/CAF can be determined, with the substance
under conditions whereby histone acetylation by P/CAF can occur; determining the
amount of histone acetylation by P/CAF in the presence of the substance; and comparing
the amount of histone acetylation by P/CAF in the presence of the substance with the
amount of histone acetylation by P/CAF in the absence of the substance, a decreased
amount of histone acetylation by P/CAF in the presence of the substance indicating a
substance that can inhibit the transcription modulating activity and cell cycle progression
suppressing activity of P/CAF. The acetylation of histones by P/CAF can be determined
in a system including, for example, either core histones (histones H2A, H2B, H3 and
H4) or the nucleosome core particles (146 base pairs of DNA wrapped around the
octamer of core histones) as substrates, the P/CAF protein and radiolabeled acetyl-CoA
(e.g., [1-14C]acetyl CoA). The presence of acetylated histones can be detected by
autoradiography after separation by SDS-PAGE as described herein in the Examples.
Thus, the compound to be tested for the ability to inhibit the transcription modulating
activity of P/CAF by interfering with the histone acetyltransferase activity of P/CAF can
be added to this system and assayed for inhibiting ability.

Also provided in the present invention is a bioassay for screening substances for
the ability to inhibit the binding of p300 to P/CAF, comprising contacting a system in
which the binding of p300 to P/CAF can be determined, with the substance under
conditions whereby the binding of p300 and P/CAF can occur; determining the amount
of p300 binding to P/CAF in the presence of the substance; and comparing the amount
of p300 binding to P/CAF in the presence of the substance with the amount of p300
binding to P/CAF in the absence of the substance, a decreased amount of p300 binding
to P/CAF in the presence of the substance indicating a substance that can inhibit the
binding of p300 to P/CAF. The binding of p300 to P/CAF can be determined in a
system, for example, which can include a cell free reaction mixture comprising a
fragment of the p300 protein comprising the amino acid sequence of SEQ ID NO:3 and
P/CAF. Alternatively, the system can comprise a cell extract produced from cells
producing both p300 and P/CAF. Determination of the binding of p300 to P/CAF can
be carried out as taught herein.

Additionally provided in the present invention is a bioassay for screening 
substances for the ability to inhibit the binding of CBP to P/CAF, comprising contacting
a system in which the binding of CBP to P/CAF can be determined, with the substance
under conditions whereby the binding of CBP to P/CAF can occur; determining the
amount of CBP binding to P/CAF in the presence of the substance; and comparing the
amount of CBP binding to P/CAF in the presence of the substance with the amount of
CBP binding to P/CAF in the absence of the substance, a decreased amount of CBP
binding to P/CAF in the presence of the substance indicating a substance that can inhibit
the binding of CBP to P/CAF. The binding of CBP to P/CAF can be determined in a
system, for example, which can include a cell free reaction mixture comprising a
fragment of the CBP protein comprising the amino acid sequence of SEQ ID NO: 9 and
P/CAF. Alternatively, the system can comprise a cell extract produced from cells
producing both CBP and P/CAF. Determination of the binding of CBP to P/CAF can be
carried out as taught herein.

The present invention further contemplates a bioassay for screening substances
for the ability to stimulate the histone acetyltransferase activity of P/CAF comprising
contacting a system, in which histone acetylation by P/CAF can be determined, with the
substance; determining the amount of histone acetylation by P/CAF in the presence of
the substance; and comparing the amount of histone acetylation by P/CAF in the
presence of the substance with the amount of histone acetylation by P/CAF in the
absence of the substance, an increased amount of histone acetylation by P/CAF in the
presence of the substance indicating a substance that can stimulate the histone
acetyltransferase activity of P/CAF. The acetylation of histones by P/CAF can be
determined in a system including, for example, either core histones (histones H2A, H2B,
H3 and H4) or the nucleosome core particles (146 base pairs of DNA wrapped around
the octamer of core histones) as substrates, the P/CAF protein and radiolabeled
acetyl-CoA (e.g., [1-14C]acetyl CoA). The presence of acetylated histones can be detected by
autoradiography after separation by SDS-PAGE as described herein in the Examples.
Thus, the compound to be tested for the ability to stimulate the histone acetyltransferase
activity of P/CAF can be added to this system and assayed for stimulating ability.

The present invention further contemplates a bioassay for screening substances 
for the ability to stimulate the transcription modulating activity of P/CAF comprising
contacting a system, in which histone acetylation by P/CAF can be determined, with the
substance; determining the amount of histone acetylation by P/CAF in the presence of
the substance; and comparing the amount of histone acetylation by P/CAF in the
presence of the substance with the amount of histone acetylation by P/CAF in the
absence of the substance, an increased amount of histone acetylation by P/CAF in the
presence of the substance indicating a substance that can stimulate the transcription
modulating activity of P/CAF. The acetylation of histones by P/CAF can be determined
in a system including, for example, either core histones (histones H2A, H2B, H3 and
H4) or the nucleosome core particles (146 base pairs of DNA wrapped around the
octamer of core histones) as substrates, the P/CAF protein and radiolabeled acetyl-CoA
(e.g., [1-14C]acetyl CoA). The presence of acetylated histones can be detected by
autoradiography after separation by SDS-PAGE as described herein in the Examples.
Thus, the compound to be tested for the ability to stimulate the transcription modulating
activity of P/CAF by increasing the histone acetyltransferase activity of P/CAF can be
added to this system and assayed for stimulating ability.

The present invention further provides a bioassay for screening substances for
the ability to stimulate binding of p300 to P/CAF, comprising contacting a system in
which the binding of p300 to P/CAF can be determined, with the substance under
conditions whereby the binding of p300 to P/CAF can occur; determining the amount of
p300 binding to P/CAF in the presence of the substance; and comparing the amount of
p300 binding to P/CAF in the presence of the substance with the amount of p300
binding to P/CAF in the absence of the substance, an increased amount of p300 binding
to P/CAF in the presence of the substance indicating a substance that can stimulate the
binding of p300 to P/CAF. The binding of p300 to P/CAF can be determined in a
system, for example, which can include a cell free reaction mixture comprising a
fragment of the p300 protein comprising the amino acid sequence of SEQ ID NO:3 and
P/CAF. Alternatively, the system can comprise a cell extract produced from cells 
producing both p300 and P/CAF. Determination of the binding of p300 to P/CAF can 
be carried out as taught herein. 

Additionally provided in the present invention is a bioassay for screening 
substances for the ability to stimulate the binding of CBP to P/CAF, comprising 
contacting a system in which the binding of CBP to P/CAF can be determined, with the 
substance under conditions whereby the binding of CBP to P/CAF can occur; 
determining the amount of CBP binding to P/CAF in the presence of the substance; and 
comparing the amount of CBP binding to P/CAF in the presence of the substance with 
the amount of CBP binding to P/CAF in the absence of the substance, an increased 
amount of CBP binding to P/CAF in the presence of the substance indicating a 
substance that can stimulate the binding of CBP to P/CAF. The binding of CBP to 
P/CAF can be determined in a system, for example, which can include a cell free 
reaction mixture comprising a fragment of the CBP protein comprising the amino acid 
sequence of SEQ ID NO: 9 and P/CAF. Alternatively, the system can comprise a cell
extract produced from cells producing both CBP and P/CAF. Determination of the
binding of CBP to P/CAF can be carried out as taught herein.

Transcription modulating activity of P/CAF

The present invention contemplates a method for inhibiting the transcription
modulating activity of P/CAF in a subject, comprising administering to the subject a
transcription modulating activity inhibiting amount of a substance in a pharmaceutically
acceptable carrier. For example, the substance can be identified according to the
protocols provided herein as one that can inhibit the transcription modulating activity of
P/CAF by preventing the binding of P/CAF to p300/CBP or by inhibiting the histone
acetyltransferase activity of P/CAF as well as by any other inhibitory mechanism as
identified by the protocols provided herein. Inhibition of the transcription modulating
activity of P/CAF in a subject is desirable, for example, to inhibit HIV TAT-mediated
transcription and therefore, the method of the present invention can be used to treat
HIV-infected subjects.

The substance can be in a pharmaceutically acceptable carrier. By 
"pharmaceutically acceptable" is meant a material that is not biologically or otherwise 
undesirable, i.e., the material may be administered to a subject, along with the substance, 
without causing any undesirable biological effects or interacting in a deleterious manner
with any of the other components of the pharmaceutical composition in which it is
contained. The carrier would naturally be selected to minimize any degradation of the
active ingredient and to minimize any adverse side effects in the subject.

The transcription modulating activity and/or histone acetyltransferase activity of
P/CAF can be inhibited in a subject by administering to the subject a substance which
binds p300/CBP at the P/CAF binding site or a substance which binds the P/CAF
protein at the p300/CBP binding site, the ultimate result being that P/CAF and
p300/CBP do not bind with one another and P/CAF cannot exert its transcription
modulating and/or histone acetyltransferase effect. The substance can be a protein, such
as an antibody which binds the P/CAF protein at or near the p300/CBP
binding site, thereby preventing its binding, or an antibody which binds the p300/CBP
protein at or near the P/CAF binding site, thereby preventing its binding. The substance
can also bind the histone acetyltransferase site on P/CAF or the acetylation site on the
histone, thereby preventing acetylation by P/CAF.

The substance which binds p300/CBP, the P/CAF protein or the histone and has 
the net effect of inhibiting the transcription modulating effect and/or histone
acetyltransferase activity of P/CAF in the cell can be delivered to a cell in the subject by 
mechanisms well known in the art. 

Alternatively, a nucleic acid encoding a protein which binds either to p300/CBP 
or the P/CAF protein and has the net effect of inhibiting the transcription modulating 
effect and/or histone acetyltransferase activity of P/CAF in the cell can be delivered to a 
cell in the subject by gene transduction mechanisms well known in the art. For example, 
nucleic acid can be introduced by liposomes as well as via retroviral or adeno-associated
viral vectors, as described below.

The substance which inhibits the transcription modulating effect and/or histone 
acetyltransferase activity of P/CAF can be an antisense RNA or an antisense DNA which
binds the RNA or DNA of P/CAF, thereby preventing translation or transcription of the
RNA or DNA encoding P/CAF and having the net effect of inhibiting the transcription
modulating effect and/or histone acetyltransferase activity of P/CAF by inhibiting P/CAF
production. The antisense RNA of the present invention can be generated from the
nucleic acid of SEQ ID NO: 14 (human) or SEQ ID NO: 15 (mouse). Furthermore, the
antisense DNA can be a phosphorothioate oligodeoxyribonucleotide having the
nucleotide sequence of SEQ ID NO: 16 (human) or of SEQ ID NO: 17 (mouse). The
mouse antisense RNA can be used to inhibit the activity of mouse P/CAF, having the
nucleotide sequence of SEQ ID NO: 18 and the amino acid sequence of SEQ ID NO:8.
The present invention also contemplates an antisense nucleic acid sequence which can
bind the DNA or RNA of any of the transcription factors or other proteins now known
or later identified to bind P/CAF, thereby inhibiting expression of the gene products of
these proteins and having the net effect of inhibiting the transcription modulating effect
and/or histone acetyltransferase activity of P/CAF.

The antisense nucleic acid can comprise a typical nucleic acid, but the antisense 
nucleic acid can also be a modified nucleic acid or a derivative of a nucleic acid such as a
phosphorothioate analogue of a nucleic acid. The composition can comprise, for 
example, an antisense RNA that specifically binds an RNA encoded by the gene 
encoding the serum protein. Antisense RNAs can be synthesized and used by standard 
methods (62). 

Antisense RNA can inhibit gene expression by forming an RNA/RNA duplex 
between the antisense RNA and the RNA transcribed from the target gene. The precise 
mechanism by which this duplex formation decreases the production of the protein 
encoded by the endogenous gene probably involves binding of complementary regions
of the normal sense mRNA and the antisense RNA strand with duplex formation in a
manner that blocks RNA processing and translation. Alternative mechanisms include
the formation of a triplex between the antisense RNA and duplex DNA or the formation
of a DNA-RNA duplex with subsequent degradation of DNA-RNA hybrids by RNase
H. Furthermore, an antigene effect can result from certain DNA-based oligonucleotides
via triple-helix formation between the oligomer and double-stranded DNA which results
in the repression of gene transcription. Regardless of the specific molecular mechanism,
the present invention results in inhibition of expression of the P/CAF gene by the
introduced and replicated DNA resulting in inhibition of the transcription modulating
and/or histone acetyltransferase activity of P/CAF, by a reduction in the expression of
the nucleic acid to which the antisense nucleic acid is hybridized, and therefore a
reduction of the gene product from the targeted gene.

The antisense nucleic acid may be obtained by any number of techniques known 
to one skilled in the art. One method of constructing an antisense nucleic acid is to 
synthesize a recombinant antisense DNA molecule. For example, oligonucleotide
synthesis procedures are routine in the art and oligonucleotides coding for a particular
protein or regulatory region are readily obtainable through automated DNA synthesis.
A nucleic acid for one strand of a double-stranded molecule can be synthesized and
hybridized to its complementary strand. One can design these oligonucleotides such that
the resulting double-stranded molecule has either internal restriction sites or appropriate
5' or 3' overhangs at the termini for cloning into an appropriate vector. Double-stranded
molecules coding for relatively large proteins or regulatory regions can be synthesized
by first constructing several different double-stranded molecules that code for particular
regions of the protein or regulatory region, followed by ligating these DNA molecules
together. Once the appropriate DNA molecule is synthesized, this DNA can be cloned
downstream of a promoter in an antisense orientation. Techniques such as this are
routine in the art and are well documented. 
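
When designing an oligonucleotide for synthesis as described above, the antisense strand of a chosen sense sequence is its reverse complement. The sketch below illustrates that step only; the function name `antisense_dna` and the six-base example are hypothetical and are not drawn from any sequence in the sequence listing.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_dna(sense: str) -> str:
    """Return the reverse complement of a sense-strand DNA sequence.

    Input and output are both written 5'->3'.
    """
    sense = sense.upper()
    if not sense or set(sense) - set("ACGT"):
        raise ValueError("sequence must be a non-empty string of A, C, G and T")
    return sense.translate(_COMPLEMENT)[::-1]


# Hypothetical sense-strand fragment.
print(antisense_dna("ATGGCC"))  # GGCCAT
```

Applying the function twice returns the original sequence, which is a convenient sanity check before ordering a synthesis.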

An example of another method of obtaining an antisense nucleic acid is to isolate 
that nucleic acid from the organism in which it is found and clone it in an antisense
orientation. For example, a DNA or cDNA library can be constructed and screened for
the presence of the nucleic acid of interest. Methods of constructing and screening such
libraries are well known in the art and kits for performing the construction and screening
steps are commercially available (for example, Stratagene Cloning Systems, La Jolla, 
CA). Once isolated, the nucleic acid can be directly cloned into an appropriate vector in
an antisense orientation, or if necessary, be modified to facilitate the subsequent cloning
steps. Such modification steps are routine, an example of which is the addition of 
oligonucleotide linkers which contain restriction sites to the termini of the nucleic acid. 
General methods are set forth in Sambrook et al. (39). 

The DNA that is introduced into the cell is in an expression orientation that is
antisense to a corresponding endogenous DNA or RNA of the cells. For example,
where an endogenous DNA comprises a gene which encodes a particular protein, the
introduced DNA is in an expression orientation opposite the expression of the
endogenous DNA, that is, the DNA operatively linked to a promoter is in an antisense
expression orientation relative to the corresponding endogenous gene. The introduced
DNA may be homologous to the entire transcribed gene or homologous to only part of
the transcribed gene. Alternatively, the sequence of the introduced DNA may be
divergent to that of the endogenous DNA but only divergent to the extent that 
hybridization of the nucleic acids occurs, thereby preventing transcription. One skilled 
in the art can determine the maximum extent of this divergence by routine screening of 
antisense DNAs corresponding to an endogenous DNA of the cell. In this manner, one
skilled in the art can readily determine which fragments, or alternatively the extent of 
homology of the fragments or the entire gene that is necessary to inhibit gene 
expression. 

The antisense nucleic acids of the present invention can be made according to
protocols standard in the art, as well as described in the Examples provided herein. The
antisense nucleic acids can be administered to a subject according to the gene 
transduction protocols standard in the art, as described below. 

The present invention also contemplates a method for stimulating the
transcription modulating activity and/or histone acetyltransferase activity of P/CAF in a
subject comprising administering to the subject a substance, in a pharmaceutically
acceptable carrier, determined according to the methods taught herein, to have a
stimulatory effect on the transcription modulating and/or histone acetyltransferase
activity of P/CAF. The substance can be one which has been identified, according to the
protocols provided herein, to stimulate histone acetyltransferase activity in P/CAF or
promote binding of P/CAF to p300/CBP. The stimulation of the transcription
modulation activity and/or histone acetyltransferase activity of P/CAF in a subject is
desirable, for example, to activate tumor suppressor p53 (which promotes apoptosis) or
to activate the muscle differentiation factor, MyoD. Thus, the method of the present
invention can be employed to treat cancer and to promote muscle differentiation in 
conditions where muscle differentiation is desired. The substance can be delivered to a 
cell in the subject by mechanisms well known in the art. 

Further contemplated in the present invention is a method for promoting binding
of P/CAF to p300/CBP in a subject, comprising administering to the subject a substance
identified by the methods provided herein to promote binding of P/CAF to either p300
or CBP.

Additionally, a nucleic acid encoding a protein which stimulates the transcription
modulating activity and/or histone acetyltransferase activity of P/CAF can be delivered
to a cell in the subject by gene transduction mechanisms, as described below.

Also provided in the present invention is a method of inhibiting the cell cycle
progression inducing effect of an oncoprotein which binds p300/CBP in a subject
comprising transducing the cells of the subject with a vector comprising a nucleic acid
encoding the P/CAF protein; inducing expression of the nucleic acid in the cell to
produce the P/CAF in an amount which will allow the P/CAF gene product to replace
the oncoprotein bound to p300/CBP, whereby the replacement of the oncoprotein
bound to p300/CBP by the P/CAF gene product inhibits the cell cycle progression
inducing effect of the oncoprotein. The oncoprotein which binds p300/CBP in the cell
can be the adenovirus E1A oncoprotein.

A method for providing a functional P/CAF protein to a subject in need of the
functional P/CAF protein is also provided, comprising transducing the cells of the
subject with a vector comprising a nucleic acid encoding the P/CAF protein and
inducing expression of the nucleic acid to produce the functional P/CAF protein in the
cell, thereby providing the functional P/CAF protein to the subject. The transduction of
the vector nucleic acid into the subject's cells can be carried out according to standard
gene therapy protocols well known in the art (see, for example, U.S. Patent No.
5,339,346).

Screening assays for p300/CBP

The present invention also provides a bioassay for screening substances for the
ability to inhibit the histone acetyltransferase activity of p300/CBP comprising
contacting a system, in which histone acetylation by p300/CBP can be determined, with
the substance under conditions whereby histone acetylation by p300/CBP can occur;
determining the amount of histone acetylation by p300/CBP in the presence of the
substance; and comparing the amount of histone acetylation by p300/CBP in the 
presence of the substance with the amount of histone acetylation by p300/CBP in the 
absence of the substance, a decreased amount of histone acetylation by p300/CBP in the 
presence of the substance indicating a substance that can inhibit the histone
acetyltransferase activity of p300/CBP. The acetylation of histones by p300/CBP can be
determined in a system including, for example, either core histones (histones H2A, H2B,
H3 and H4) or the nucleosome core particles (146 base pairs of DNA wrapped around
the octamer of core histones) as substrates, the p300/CBP protein and radiolabeled
acetyl-CoA (e.g., [1-14C]acetyl-CoA). The presence of acetylated histones can be
detected by autoradiography after separation by SDS-PAGE as described herein in the
Examples. Thus, the compound to be tested for the ability to inhibit the histone 
acetyltransferase activity of p300/CBP can be added to this system and assayed for 
acetyltransferase inhibiting ability. 
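
The comparison step of this assay can be sketched as a simple calculation. The Python example below is only an illustration: it assumes the acetylation signal has already been quantified (for instance as band intensity from the autoradiograph), and the function names, the signal units and the 20% cutoff are hypothetical choices that do not come from the specification.

```python
# Illustrative sketch only: compare quantified acetylation signal measured
# without and with a test substance. Units and the cutoff are hypothetical.
def percent_inhibition(signal_without: float, signal_with: float) -> float:
    """Percent decrease in histone acetylation caused by the substance."""
    if signal_without <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * (signal_without - signal_with) / signal_without


def is_candidate_inhibitor(signal_without: float, signal_with: float,
                           cutoff_percent: float = 20.0) -> bool:
    """Flag a substance whose inhibition meets the assumed cutoff."""
    return percent_inhibition(signal_without, signal_with) >= cutoff_percent
```

A negative result from `percent_inhibition` means the signal rose in the presence of the substance, which is the outcome of interest in an assay for stimulation rather than inhibition.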

Also provided in the present invention is a bioassay for screening substances for 
the ability to inhibit the binding of a transcriptional factor to p300/CBP, comprising 
contacting a system in which the binding of a transcriptional factor to p300/CBP can be 
determined, with the substance under conditions whereby the binding of the
transcriptional factor and p300/CBP can occur; determining the amount of
transcriptional factor binding to p300/CBP in the presence of the substance; and
comparing the amount of transcriptional factor binding to p300/CBP in the presence of
the substance with the amount of transcriptional factor binding to p300/CBP in the
absence of the substance, a decreased amount of transcriptional factor binding to
p300/CBP in the presence of the substance indicating a substance that can inhibit the
binding of a transcriptional factor to p300/CBP. The binding of a transcriptional factor
to p300/CBP can be determined in a system, for example, which can include a cell free
reaction mixture comprising a transcriptional factor which binds p300/CBP and
p300/CBP. Alternatively, the system can comprise a cell extract produced from cells
producing both a transcriptional factor which binds p300/CBP and p300/CBP. The
transcriptional factor which binds p300/CBP can be selected from, but is not limited to
the group consisting of nuclear hormone receptors, CREB, c-Jun/v-Jun, c-Myb/v-Myb,
YY1, Sap-1a, c-Fos, MyoD and SRC-1, as well as any other transcriptional factor now
known or later identified to bind p300/CBP. The screening assay of the present
invention can also be used to identify substances which inhibit the binding of p300/CBP
to other components to which it is known to bind, for example, P/CAF, pp90rsk, TFIIB,
E1A, SV40 large T antigen, as well as any other substances now known or later
identified to bind p300/CBP. Determination of the binding of a transcriptional factor or
other substance to p300/CBP can be carried out as taught in the Examples herein as well
as by protocols described in the literature.

The present invention further contemplates a bioassay for screening substances
for the ability to stimulate the histone acetyltransferase activity of p300/CBP comprising
contacting a system, in which histone acetylation by p300/CBP can be determined, with
the substance; determining the amount of histone acetylation by p300/CBP in the
presence of the substance; and comparing the amount of histone acetylation by
p300/CBP in the presence of the substance with the amount of histone acetylation by
p300/CBP in the absence of the substance, an increased amount of histone acetylation
by p300/CBP in the presence of the substance indicating a substance that can stimulate
the histone acetyltransferase activity of p300/CBP. The acetylation of histones by
p300/CBP can be determined in a system including, for example, either core histones
(histones H2A, H2B, H3 and H4) or the nucleosome core particles (146 base pairs of
DNA wrapped around the octamer of core histones) as substrates, the p300/CBP
protein and radiolabeled acetyl-CoA (e.g., [1-14C]acetyl-CoA). The presence of
acetylated histones can be detected by autoradiography after separation by SDS-PAGE
as described herein in the Examples. Thus, the compound to be tested for the ability to
stimulate the histone acetyltransferase activity of p300/CBP can be added to this system
and assayed for stimulating ability.



The present invention further provides a bioassay for screening substances for
the ability to stimulate binding of a component, which binds p300/CBP, to p300/CBP,
comprising contacting a system in which the binding of the component to p300/CBP can
be determined, with the substance under conditions whereby the binding of the
component to p300/CBP can occur; determining the amount of component binding to
p300/CBP in the presence of the substance; and comparing the amount of component
binding to p300/CBP in the presence of the substance with the amount of component
binding to p300/CBP in the absence of the substance, an increased amount of
component binding to p300/CBP in the presence of the substance indicating a substance
that can stimulate the binding of the component to p300/CBP. The binding of the
component to p300/CBP can be determined in a system, for example, which can include
a cell free reaction mixture comprising the component and p300/CBP. Alternatively, the
system can comprise a cell extract produced from cells producing both the component
and p300/CBP. The component which binds p300/CBP can be any of the transcriptional
factors or other proteins which are known or are identified in the future to bind
p300/CBP, as set forth above. Determination of the binding of the component to
p300/CBP can be carried out as taught in the Examples provided herein and according
to protocols available in the literature.

Histone acetyltransferase activity of p300/CBP

A method for inhibiting the histone acetyltransferase activity of p300/CBP in a 
subject is provided in the present invention, comprising administering to the subject a 
histone acetyltransferase activity inhibiting amount of a substance in a pharmaceutically
acceptable carrier. The mechanism of the inhibitory action of the substance can be the
inhibition of the binding of a DNA-binding transcription factor, such as, for example, a
nuclear hormone receptor, CREB, c-Jun/v-Jun, c-Myb/v-Myb, YY1, Sap-1a, c-Fos,
MyoD or SRC-1, to p300/CBP. 

The histone acetyltransferase activity of p300/CBP can be inhibited in a subject
by administering to the subject a substance which binds p300/CBP at the transcription
factor binding site or a substance which binds the transcription factor protein at the
p300/CBP binding site, the ultimate result being that the transcription factor and
p300/CBP do not bind with one another and p300/CBP cannot acetylate histones.

The substance which binds either to the transcription factor or the p300/CBP
protein and has the net effect of inhibiting the histone acetyltransferase activity of 
p300/CBP in the cell can be identified according to the screening methods provided 
herein and delivered to a cell in the subject by mechanisms well known in the art. The
substance can be a protein, such as an antibody which binds the p300/CBP protein
at or near the DNA-binding transcription factor binding site, thereby
preventing its binding or an antibody which binds the DNA-binding transcription factor
at or near the p300/CBP binding site, thereby preventing its binding. The substance can
also bind the histone acetyltransferase site on p300/CBP (aa 1195-1673 on p300 or aa
1174-1850 on CBP) or at the acetylation site on the histone, thereby preventing
acetylation by p300/CBP. 

Additionally, the substance can be a nucleic acid which can be expressed in the 
cell to produce a protein which inhibits the histone acetyltransferase activity of
p300/CBP. For example, a nucleic acid encoding a protein which binds either to a
transcription factor or the p300/CBP protein and has the net effect of inhibiting the
histone acetyltransferase activity of p300/CBP in the cell can be delivered to a cell in the
subject by gene transduction mechanisms well known in the art. For example, nucleic
acid can be introduced by liposomes as well as via retroviral or adeno-associated viral
vectors, as described below.

The substance which inhibits the histone acetyltransferase activity of p300/CBP 
can be an antisense RNA or an antisense DNA which binds the RNA or DNA of 
p300/CBP thereby preventing translation or transcription of the RNA or DNA encoding 
p300/CBP and having the net effect of inhibiting the histone acetyltransferase activity of
P/CAF by inhibiting p300/CBP production. The antisense RNA or DNA of the present 
invention can be produced and introduced into cells according to the same methods as 
set forth above for P/CAF antisense nucleic acids. 

The present invention also contemplates a method for stimulating the histone
acetyltransferase activity of p300/CBP in a subject comprising administering to the
subject a histone acetyltransferase activity stimulating amount of a substance, in a
pharmaceutically acceptable carrier, determined according to the methods taught herein,
to have a stimulatory effect on the histone acetyltransferase activity of p300/CBP. The
substance can exert a stimulatory effect by promoting the binding of a DNA-binding
transcription factor of the present invention to p300/CBP. The substance can be
delivered to a cell in the subject by mechanisms well known in the art. A nucleic acid 
encoding a protein which stimulates the transcription modulating activity of p300/CBP 
can be delivered to a cell in the subject by gene transduction mechanisms, as described 
below. 

Gene transduction 

In the methods described above which include gene transduction into cells (i.e., 
addition of exogenous DNA into cells), the nucleic acids of the present invention can be 
in a vector for delivering the nucleic acids to the site for expression of the P/CAF
protein. The vector can be one of the commercially available preparations, such as the
pGM plasmid (Promega). Vector delivery can be by liposome, using commercially
available liposome preparations or newly developed liposomes having the features of the
present liposomes. Additionally, vector delivery can be via a viral system, including, but
not limited to, retroviral, adenoviral and adeno-associated viral systems. Other delivery
methods can be adopted and routinely tested according to the methods taught herein.

The modes of administration of the liposome will vary predictably according to 
the disease being treated and the tissue being targeted. For example, for treating cancer 
in either the lung or the liver, which are both sinks for liposomes, intravenous delivery is
reasonable. For other localized cancers, as well as precancerous conditions,
catheterization of an artery upstream from the target organ is a preferred mode of
delivery, because it avoids significant clearance of the liposome by the lung and liver.
For cancerous lesions at a number of other sites (e.g., skin cancer, localized dysplasias),
topical delivery is expected to be effective and may be preferred, because of its
convenience.

Leukemias and other disorders involving dysregulated proliferation of certain
isolatable cell populations may be more readily treated by ex vivo administration of the 
nucleic acid. 

The liposomes may be administered topically, parenterally (e.g., intravenously),
by intramuscular injection, by intraperitoneal injection, transdermally, extracorporeally
or the like, although intravenous or topical administration is typically preferred. The
exact amount of the liposomes required will vary from subject to subject, depending on
the species, age, weight and general condition of the subject, the severity of the disease
being treated, the particular compound used, its mode of administration and the like.
Thus, it is not possible to specify an exact amount. However, an appropriate amount 
may be determined by one of ordinary skill in the art using only routine experimentation 
given the teachings herein. 

Parenteral administration, if used, is generally characterized by injection.
Injectables can be prepared in conventional forms, either as liquid solutions or
suspensions, solid forms suitable for solution or suspension in liquid prior to injection, or
as emulsions. A more recently revised approach for parenteral administration involves
use of a slow release or sustained release system such that a constant level of dosage is
maintained. See, e.g., U.S. Patent No. 3,610,795, which is incorporated by reference
herein.

Topical administration can be by creams, gels, suppositories and the like. Ex
vivo (extracorporeal) delivery can be as typically used in other contexts. 

The present invention is more particularly described in the following examples
which are intended as illustrative only since numerous modifications and variations 
therein will be apparent to those skilled in the art. 



EXAMPLES

I. P/CAF studies.

Cloning and characterization of P/CAF protein.

In human cells, CBP binds to c-Jun in a phosphorylation-dependent manner in
association with stimulation of transcription (9). In yeast, GCN4 is believed to be a
c-Jun counterpart on the basis of similarities in DNA recognition (15) as well as the
participation of both proteins in UV signaling pathways (16). Yeast genetic screening
has led to the isolation of various cofactors for GCN4, including GCN5 (yGCN5),
ADA2 (yADA2) and ADA3 (yADA3) (17-19). These factors are considered to
function as a complex (or in a common pathway) based on genetic and protein-protein
interaction studies (18-22). Finally, p300/CBP and yADA2 exhibit significant sequence
similarity within a 50 amino acid region including a Zn2+ finger motif (3). Human
counterparts to yGCN5, yADA2, or yADA3 that interact with p300/CBP to mediate
transcriptional activation by c-Jun were searched for in various nucleotide sequence 
databases. 

Comparison of the yGCN5 protein sequence with various databases (23)
revealed significant similarities with the two randomly sequenced human cDNAs,

ETS05039 (24) (P=4.0xl0'^') and NIB2000-5R (P=6.5xlO-^). Given that these cDNAs 
were truncated, human fetal liver and fetal brain cDNA libraries (Clontech) were 
screened with ETS05039 and NIB2000-5R, respectively and complete clones were 
isolated from the human fetal liver cDNA library. The complete sequences revealed that
the ETS05039- and NIB2000-5R-derived clones are encoded by distinct genes but are
highly related within the protein coding regions (68% identity at the DNA level, 75% 
identity and 86% similarity at the protein level). The former encodes an N-terminal 
region with no sequence similarity to any proteins in the databases besides the yGCN5- 
related C-terminal region, whereas the latter encodes only the yGCN5-related region. 
Given that p300/CBP-binding activity was observed in the former polypeptide as shown
below, it was designated p300/CBP-associated factor (P/CAF), having the amino acid
sequence of SEQ ID NO: 1 and the nucleotide sequence of SEQ ID NO: 10 and the latter
was named human GCN5 (hGCN5), having the amino acid sequence of SEQ ID NO: 5 
and the nucleotide sequence of SEQ ID NO: 11. 
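
Identity figures such as those quoted above are computed from an alignment. The Python sketch below shows one common convention, counting identical residues over aligned columns that contain no gap; the specification does not state which convention or alignment program was used, so the function name, the gap handling and the toy sequences are all assumptions for illustration.

```python
# Illustrative sketch only: percent identity over gap-free alignment columns.
# The convention (ignore columns containing '-') is an assumption.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free aligned columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Toy aligned fragments (hypothetical, not P/CAF or hGCN5 sequence).
identity = percent_identity("ACGT-A", "ACCTGA")
```

Other conventions divide by the full alignment length or by the shorter sequence, so a reported value is only comparable with another when the same denominator is used.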



Additionally, an RNA blot (Clontech) was hybridized with a random-primed 
probe made from the cDNA encoding P/CAF. RNA blotting indicated that transcripts 
detected by the P/CAF and hGCN5 cDNAs are ubiquitously expressed, but the former is 
most abundant in heart and skeletal muscle, whereas the latter is most abundant in 
pancreas and skeletal muscle. 



P/CAF-p300/CBP interaction in vitro 

The P/CAF binding site was presumed to reside in the C terminal one third of 
CBP (residues 1,678-2,442) because it was observed that this region, when fused to a 
DNA binding domain, activates transcription (4) in a manner repressed by coexpression 
of 12S E1A. This region was divided into 6 overlapping fragments and each was
expressed in E. coli as a glutathione-S-transferase (GST) fusion protein. GST-CBP
fusions were incubated with recombinant P/CAF protein and, subsequently, purified 
using glutathione-Sepharose. Co-purified P/CAF was detected by immunoblotting 
analysis. 

To construct GST-fusions, various regions of CBP and p300 were amplified by
PCR. A series of deletions of the CBP segment B was created by site-directed in vitro
mutagenesis (30). These fragments were subcloned into pGEX-2T (Pharmacia).
GST-fusions were expressed in E. coli and extracted with buffer B [20 mM Tris-HCl (pH
8.0), 5 mM MgCl2, 10% glycerol, 1 mM AEBSF, 0.1% NP40, 10 µg/ml of aprotinin, 10
µg/ml of leupeptin, 1 µg/ml of pepstatin A, 1 mM DTT] containing 0.1 M KCl for these
experiments. GST-CBP-segment B was purified by glutathione-Sepharose and
phenyl-Sepharose chromatographic steps. P/CAF, hGCN5, and E1A were expressed as
FLAG-fusions in Sf9 cells via baculovirus vectors and affinity-purified with M2-agarose
(ref 30; Kodak-IBI). For interaction, a crude E. coli extract containing 20 pmol of
GST-fusion was incubated with 40-60 pmol of P/CAF or E1A in a total volume of 50 µl of
buffer B with 0.1 M KCl on ice for 10 min. Samples were further incubated with 10 µl
(packed volume) of glutathione-Sepharose at 4°C for 30 min, washed four times with
200 µl of buffer B containing 0.1 M KCl, and eluted with 20 µl of buffer E [50 mM
Tris-HCl (pH 8.0), 0.2 M KCl, 20 mM glutathione] for 60 min. Interacting proteins
were detected by anti-FLAG immunoblotting or silver staining.
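
For orientation, the amounts given for the binding reaction can be converted into concentrations. The short Python sketch below is only an arithmetic aid; it assumes the stated 50 µl total volume applies to all components, and the function name is hypothetical.

```python
# Illustrative arithmetic only: 1 pmol per µl equals 1 µmol per litre (1 µM).
def micromolar(amount_pmol: float, volume_ul: float) -> float:
    """Concentration in µM for an amount in pmol dissolved in volume_ul µl."""
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return amount_pmol / volume_ul


REACTION_VOLUME_UL = 50.0                                # stated total volume
gst_fusion_um = micromolar(20.0, REACTION_VOLUME_UL)     # GST-fusion
partner_low_um = micromolar(40.0, REACTION_VOLUME_UL)    # lower partner amount
partner_high_um = micromolar(60.0, REACTION_VOLUME_UL)   # upper partner amount
molar_excess = (40.0 / 20.0, 60.0 / 20.0)                # partner over GST-fusion
```

Under that assumption the GST-fusion is at 0.4 µM and the FLAG-tagged partner at 0.8 to 1.2 µM, a two- to three-fold molar excess.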

For p300 interactions, the segment spanning residues 1763-1966 (segment B') of 
p300, which is analogous to the CBP segment-B, was used. Twenty percent of the 
P/CAF and hGCN5 inputs and 100% of the E1A input were also analyzed. In the GST
precipitation assays, almost identical amounts of the GST fusions were recovered in all
samples. Interaction between P/CAF and CBP (segment B) was determined in the
absence and in the presence of E1A. Control reactions with GST-CBP alone and
without GST-CBP were also performed. Input proteins were analyzed. 

Two CBP segments, A and B, interacted specifically with P/CAF. The stronger
interaction was observed in the latter segment, which does not include the yADA2-like
Zn2+ finger. Given that the CBP segment-B is well conserved in p300 (66% identity,
75% similarity), the binding of P/CAF to p300 in vitro was also analyzed. For this
experiment, the p300 segment spanning residues 1763-1966, termed segment B', which
is analogous to the CBP segment-B, was used. Like CBP, p300 interacted specifically
with P/CAF. These studies demonstrated that P/CAF binds specifically to both p300 
and CBP in vitro. In contrast to P/CAF, hGCN5 did not bind to CBP or p300. 

These studies also demonstrated that the Zn2+ finger region of p300/CBP, which
shares sequence similarity with yADA2, is not essential for the interaction with P/CAF.
Cloning of a human structural homolog of yADA2, termed hADA2 (25) has revealed
that, unlike the sequence similarity between p300/CBP and yADA2, which is restricted
to a 50 amino acid region, hADA2 shares extensive similarity (30% identity, 52%
similarity) to yADA2 over the entire protein sequence. Moreover, a computer search of
the complete genomic sequence of Saccharomyces cerevisiae revealed that yeast does
not have counterparts of p300/CBP or P/CAF. Thus, the p300/CBP-P/CAF pathway
may have been acquired during metazoan evolution. 

Action of E1A in vitro

Previous reports indicated that E1A binds to both the p300 segment spanning
residues 1767-1816 and the CBP segment spanning residues 1805-1854 (7). These
interactions were reconfirmed in the present system; thus, both p300 and CBP segments
covering the previously identified regions interacted with E1A.



For further mapping, a series of deletions was introduced within the CBP
segment-B and tested for interactions with P/CAF and E1A. Deletions of residues
1801-1825 or 1824-1851 markedly reduced interactions with both P/CAF and E1A,
whereas deletion of residues 1850-1878 did not affect these interactions. Furthermore,
deletion of residues 1801-1851 completely abolished interactions with both P/CAF and
E1A. These data indicate that residues 1801-1851 of CBP are critical for interaction
with both P/CAF and E1A. Taken together with the evidence that CBP segment A (aa
residues 1,678-1,880) also binds to these factors, the above findings demonstrate that
P/CAF and E1A bind to the same or very closely spaced sites on CBP.
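
The deletion mapping above can be followed with simple interval bookkeeping. The Python sketch below counts how many residues of the region reported as critical (1801-1851, treated as inclusive) each deletion removes; the function name is hypothetical and the code only tabulates overlap, it does not model or predict binding.

```python
# Illustrative sketch only: residues of the critical CBP region removed by
# each reported deletion. Residue ranges are treated as inclusive.
CRITICAL = (1801, 1851)


def overlap_length(deletion, region=CRITICAL):
    """Number of residues shared by two inclusive residue ranges."""
    start = max(deletion[0], region[0])
    end = min(deletion[1], region[1])
    return max(0, end - start + 1)


deletions = [(1801, 1825), (1824, 1851), (1850, 1878), (1801, 1851)]
overlaps = {d: overlap_length(d) for d in deletions}
```

The two deletions that markedly reduced binding remove 25 and 28 of the 51 critical residues, the deletion that abolished binding removes all 51, and the deletion without effect removes only 2.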



Evidence that both P/CAF and E1A recognize the same p300/CBP segments
raises the possibility of direct competition between P/CAF and E1A for binding to
p300/CBP. To test this possibility, a competition experiment was performed with the
use of affinity purified recombinant proteins. The interaction of P/CAF with the
CBP-segment B was progressively inhibited by the addition of increasing amounts of E1A. In
contrast, no inhibition was caused by an E1A mutant which does not bind to p300/CBP
(E1AΔN). Similar results were obtained with the p300-segment B', leading to the
conclusion that P/CAF and E1A compete for the same binding sites in p300/CBP.



P/CAF-p300/CBP interaction in vivo

The in vivo interaction between P/CAF and p300/CBP was established by co- 
immunoprecipitation from a human osteosarcoma cell extract. Proteins in this extract 
were immunoprecipitated with rabbit anti-P/CAF, rabbit anti-CBP and anti-p300 
antibodies. For controls, cell extract was precipitated with rabbit control IgG or mouse
anti-HA monoclonal antibody. The precipitates were analyzed by immunoblotting with 
anti-P/CAF, anti-CBP and anti-p300 antibodies. 



Osteosarcoma cells were transfected with either control vector or E1A- or
E1AΔN-expression vectors. Extract from the transfected subpopulation was
immunoprecipitated with anti-P/CAF or control IgG. The precipitates were analyzed by
immunoblotting with anti-p300 and anti-P/CAF antibodies. 

Rabbit anti-P/CAF antibody was raised to the P/CAF segment spanning residues
125-397 and purified by immunoaffinity chromatography (33). A mixture of
monoclonal antibodies raised to the human p300 segment spanning residues 1572-2371
(5) and rabbit polyclonal antibodies raised to the mouse CBP segment spanning residues
2-23 (for immunoprecipitation) and 1736-2179 (immunoblotting) were purchased from
Upstate Biotechnology. Approximately 2 x 10' human osteosarcoma U-2 OS cells
(ATCC accession number HTB 96) were extracted with 10 ml of lysis buffer [25 mM
HEPES-KOH (pH 7.2), 150 mM potassium acetate, 2 mM EDTA, 1 mM DTT, 1 mM
AEBSF, 10 µg/ml of aprotinin, 10 µg/ml of leupeptin, 1 µg/ml of pepstatin A, 20 mM
sodium fluoride, 0.1% NP40]. Two to 10 ml of extract were incubated with 2 µg of the
respective antibody for four hours at 4°C. Fifty µl (packed volume) of protein-A
Trisacryl (Pierce) were added and incubation was continued for two hours. The matrix
was washed four times with 1 ml of the lysis buffer, then boiled in 2x SDS sample
buffer. Human osteosarcoma U-2 OS cells were transfected with 20 µg of the indicated
plasmid and 1 µg of sorting plasmid (pCMV-IL2R) (31). The transfected subpopulation
was purified by magnetic affinity cell sorting (32). Extract from approximately 2 x 10^
sorted cells was immunoprecipitated as described.

Anti-P/CAF antibody specifically detected a 95 kDa protein, which is very close
to the calculated value for the full-length P/CAF, in the immunoprecipitates.
Anti-P/CAF antibody co-immunoprecipitated both CBP and p300. Similarly, anti-CBP
antibody also co-immunoprecipitated P/CAF. However, anti-p300 antibody did not
co-immunoprecipitate P/CAF. This is most likely due to steric interference since the
anti-p300 antibody was raised to the p300 segment spanning residues 1572-2371 which
includes the P/CAF binding region. These data demonstrate that P/CAF forms
complexes with both p300 and CBP in vivo.

Action of E1A in vivo

The in vitro experiments described herein indicate that P/CAF and E1A compete
for the binding sites in p300/CBP. Thus, a study was conducted to determine whether
E1A targets the endogenous interaction between P/CAF and p300. An E1A-expression
vector was transiently transfected into human osteosarcoma cells and the transfected
subpopulation was purified by cell sorting. Then, the interaction between P/CAF and
p300 in transfected cells was examined by co-immunoprecipitation with anti-P/CAF
antibody. The endogenous interaction of P/CAF with p300 was drastically inhibited by
expression of E1A. On the other hand, no inhibition was observed by the E1A mutant
lacking the p300 binding domain (E1AΔN), indicating that E1A disrupts the
P/CAF-p300 complex in vivo through an interaction with p300.

Cell cycle regulation by P/CAF 

Given that binding of P/CAF to p300/CBP is inhibited by E1A, experiments
were performed to evaluate whether P/CAF, by binding to and forming a functional
complex with p300, is involved in the regulation of entry into S phase. This possibility
was addressed by examining whether transient expression of P/CAF would affect the
rate of G1/S transit in HeLa cells. P/CAF negatively affected the distribution of cells
between G1 and S phases in this assay.

HeLa cells were transfected by electroporation with 7 µg of P/CAF-expression
plasmid and/or 3 µg of the full-length or the N-terminally deleted (Δ2-36) E1A
12S-expression plasmid as indicated. These plasmids were constructed by subcloning
FLAG-P/CAF and E1A cDNAs into pCX (34) and pcDNAI (Invitrogen), respectively.
All samples, in addition, contained 1 µg of sorting plasmid (pCMV-IL2R) (31) and
carrier plasmid (pCX) to normalize the total amount of DNA to 11 µg. After
transfection, cells were incubated in Dulbecco's modified Eagle's medium with 10% fetal
bovine calf serum for 12 h, and subsequently labeled in medium containing 10 µM
bromo-deoxyuridine (BrdU) for 30 min. Subsequently, the transfected subpopulation
was purified by magnetic affinity cell sorting and nuclei were analyzed by dual parameter
flow cytometry as described (32).
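
The DNA balancing described above can be illustrated with a short calculation. The following is a minimal sketch, assuming that every sample is topped up with carrier plasmid to the stated total of 11 µg; the function name and the sample layout are hypothetical and are not part of the described protocol.

```python
# Illustrative only: plasmid amounts (micrograms) are those stated in the text;
# the helper name and the dictionary layout are assumptions made for this sketch.
TOTAL_DNA_UG = 11.0


def carrier_needed_ug(plasmids_ug, total_ug=TOTAL_DNA_UG):
    """Return the amount of carrier plasmid (pCX) needed to reach total_ug."""
    used = sum(plasmids_ug.values())
    if used > total_ug:
        raise ValueError("plasmid amounts exceed the target total")
    return total_ug - used


# Every sample contains 1 ug of the sorting plasmid pCMV-IL2R.
samples = {
    "control": {"pCMV-IL2R": 1.0},
    "P/CAF": {"pCMV-IL2R": 1.0, "P/CAF": 7.0},
    "E1A": {"pCMV-IL2R": 1.0, "E1A": 3.0},
    "P/CAF + E1A": {"pCMV-IL2R": 1.0, "P/CAF": 7.0, "E1A": 3.0},
}

carrier = {name: carrier_needed_ug(mix) for name, mix in samples.items()}

for name, amount in carrier.items():
    print(f"{name}: {amount:g} ug pCX carrier")
```

Under these assumptions the sample containing both expression plasmids needs no carrier, whereas the control sample is made up almost entirely of carrier DNA.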

The fraction of cells accumulating in S phase in control cultures was 23%,
compared to 15% in P/CAF-transfected cells. This effect was reproducible in multiple
independent experiments. In parallel experiments to verify the utility of this
experimental protocol, plasmids encoding E2F-1, simian virus 40 small t, cyclin A or
cyclin E increased the accumulation of cells in S phase, whereas plasmids encoding the
cyclin-dependent kinase inhibitors p21 or p27 reduced the number of S phase cells.
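
For orientation, the size of the P/CAF effect can be expressed as a relative change in the S phase fraction. The sketch below uses only the two percentages given above; the function name is hypothetical, and no statistical treatment is implied.

```python
# Illustrative only: the two percentages are taken from the text; the helper
# name is an assumption made for this sketch.
def relative_change(control_pct, treated_pct):
    """Fractional change of treated relative to control (negative = decrease)."""
    return (treated_pct - control_pct) / control_pct


s_phase_change = relative_change(23.0, 15.0)
print(f"S phase fraction changed by {s_phase_change * 100:.1f}% relative to control")
```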

On the basis of evidence that E1A and P/CAF compete for binding sites on
p300, it seemed possible that cotransfection of P/CAF with E1A would oppose the
mitogenic effect caused by E1A. As shown by the data herein, this is indeed the case.
E1A alone has mitogenic activity in this experimental setting, while the E1A mutant
lacking the p300 binding domain (E1AΔN) has very weak activity. Comparable
expression levels between wild type and mutant E1A in the transfected cells were
revealed by immunoblotting analysis with anti-E1A. Intriguingly, when P/CAF was
cotransfected with E1A, the mitogenic activity of E1A was significantly counteracted by
P/CAF. These results show that P/CAF and E1A mediate antagonistic effects on cell
cycle progression.

In the course of assessing P/CAF activity, it was also revealed that p300 is able
to inhibit cell cycle progression under the same assay conditions. These findings suggest
that P/CAF and p300, perhaps by forming a complex, act in concert to suppress cell
cycle progression.

Histone acetyltransferase activity in P/CAF 
Acetylation of the N-terminal histone tails has been considered to play a crucial
role in accessibility of transcription factors to nucleosomal templates (26-27). Recently,
yGCN5 has been identified as a histone acetyltransferase (28). On the basis of this
information, intrinsic histone acetyltransferase activity in P/CAF and hGCN5 was
examined. As substrates, the core histones (histones H2A, H2B, H3 and H4) and the
nucleosome core particles (146 base pairs of DNA wrapped around the octamer of core
histones) were used.

Activity of hGCN5 and P/CAF that acetylates free histones or histones in the
nucleosome core particle (35) was measured as described (36). Each reaction contained
0.3 pmol of affinity purified FLAG-hGCN5 or FLAG-P/CAF, 4 pmol of the histone
octamer or the nucleosome core particle and 10 pmol of [1-14C]acetyl-CoA. The
histone octamer dissociated into dimers or tetramers under assay conditions. Acetylated
histones were detected by autoradiography after separation by SDS-PAGE.
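
The molar ratios implied by these reaction conditions can be worked out as follows. This is a sketch for orientation only: the quantities are those stated above, the count of eight polypeptide chains per histone octamer (two each of H2A, H2B, H3 and H4) is general knowledge rather than a statement of this document, and the variable names are hypothetical.

```python
# Illustrative only: quantities (pmol) are taken from the text; variable names
# are assumptions made for this sketch.
ENZYME_PMOL = 0.3        # FLAG-hGCN5 or FLAG-P/CAF
OCTAMER_PMOL = 4.0       # histone octamer or nucleosome core particle
ACETYL_COA_PMOL = 10.0   # [1-14C]acetyl-CoA
CHAINS_PER_OCTAMER = 8   # two each of H2A, H2B, H3 and H4

octamers_per_enzyme = OCTAMER_PMOL / ENZYME_PMOL
histone_chains_pmol = OCTAMER_PMOL * CHAINS_PER_OCTAMER
acetyl_coa_per_chain = ACETYL_COA_PMOL / histone_chains_pmol

print(f"octamers per enzyme molecule: {octamers_per_enzyme:.1f}")
print(f"histone chains in reaction:   {histone_chains_pmol:.0f} pmol")
print(f"acetyl-CoA per histone chain: {acetyl_coa_per_chain:.4f}")
```

On these figures the substrate is in molar excess over the enzyme, and the labeled acetyl-CoA is present at less than one molecule per histone chain.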



P/CAF and hGCN5 acetylated the core histones with almost the same efficiency.
Both factors acetylated histones H3 and H4, but preferentially H3. In contrast, very
weak or no acetylation by hGCN5 was detected in the nucleosome core particles.
Remarkably, significant acetylation by P/CAF was observed in a nucleosomal context.
Although all core histones are acetylated in the nucleus, P/CAF and hGCN5 did not
acetylate histones H2A and H2B in vitro.

The direct function of P/CAF is likely to involve its intrinsic histone acetyltransferase
activity. Although the exact molecular mechanisms by which acetylation of core histones
contributes to transcription remain undefined, acetylation of the histones is considered to
play an important role in transcriptional regulation (26-27). The positively charged
N-terminal tails of core histones are believed to affect nucleosome structure by interacting
with DNA at or near the nucleosome-spacer junction. Acetylation of the histone tails
presumably destabilizes the nucleosome and facilitates access by regulatory factors.
Likewise, there is a general correlation between the level of acetylation and
transcriptional activity of nucleosomal domains. The findings of the present invention
provide insights into the mechanisms of targeted histone acetylation.

Cellular factor p300/CBP binds to various sequence-specific factors that are
involved in cell growth and/or differentiation, including CREB (3,4), c-Jun (9), Fos (11),
c-Myb (12) and nuclear receptors (13). P/CAF could stimulate the activation function
of these factors via promoter-specific histone acetylation. The present invention
demonstrates that E1A appears to perturb normal cellular regulation by disrupting the
connection between p300/CBP and its associated histone acetyltransferase.

II. P300/CBP studies. 


Purification of E1A-associated histone acetyltransferase

FLAG-epitope tagged E1A (or ΔE1A) was expressed in Sf9 cells (ATCC
accession number CRL 1711) by infection with recombinant baculovirus (43). All purification
steps were carried out at 4°C. Extract was prepared from infected cells by one cycle of
freeze and thaw in buffer B (20 mM Tris-HCl, pH 8.0; 5 mM MgCl2; 10% glycerol; 1
mM PMSF; 10 mM β-mercaptoethanol; 0.1% Tween 20) containing 0.1
M KCl and the complete protease inhibitor cocktail (Boehringer Mannheim). To
prepare E1A-immobilized beads, the extract was incubated with M2
anti-FLAG antibody agarose (Kodak-IBI) for four hours with rotation, and the beads were
subsequently washed with the same buffer three times. The resulting beads were
incubated with HeLa (ATCC accession number CCL 2) nuclear extract for four to eight
hours and thereafter washed with the same buffer six times. Finally, FLAG-E1A was
eluted from the beads along with associated polypeptides by incubating with the same
buffer containing 0.1 mg/ml FLAG peptide.



For further purification, eluted polypeptides were dialyzed in 0.05 M KCl-buffer
B and subsequently loaded onto a SMART Mono Q column (Pharmacia) equilibrated
with the same 0.05 M KCl-buffer B. After washing, the column was developed with a
linear gradient of 0.05-1.0 M KCl in buffer B. Mono Q fractions were concentrated with
a MICROCON spin-filter (Amicon) and subsequently loaded onto a SMART Superdex
200 column (Pharmacia) equilibrated with 0.1 M KCl-buffer B.



Histone acetyltransferase assays 

Filter binding assays were performed as described (80) with minor modifications.
Samples were incubated at 30°C for 10-60 minutes in 30 ml of assay buffer containing
50 mM Tris-HCl, pH 8.0; 10% glycerol; 1 mM DTT; 1 mM PMSF; 10 mM sodium
butyrate; 6 pmol of [3H]acetyl CoA (4.3 mCi/mmole, Amersham Life Science Inc.); and
33 mg/ml of calf thymus histones (Sigma Chemical Co.). In experiments where synthetic
peptides were substituted for core histones, 50 pmol of each peptide were used. After
incubation, the reaction mixture was spotted onto Whatman P-81 phosphocellulose filter
paper and washed for 30 minutes with 0.2 M sodium carbonate buffer, pH 9.2, at room
temperature with 2-3 changes of the buffer. The dried filters were counted in a liquid
scintillation counter.

PAGE analysis was done as above except that 90 pmol of [14C]acetyl CoA (55
mCi/mmole, Amersham Life Science Inc.) and 9 pmol of core histones or
mononucleosomes were used. Core histones and mononucleosomes were prepared as
described (35). For trypsin digestion, reaction mixtures were further incubated with
various amounts of trypsin on ice for 30 minutes. The samples were analyzed on
one-dimensional SDS-PAGE gels or two-dimensional gels, where the first dimension was an
acid-urea-PAGE gel (44) and the second dimension was an SDS-PAGE gel.



Protein expression 

For baculovirus expression, cDNAs corresponding to p300 portions of aa 1-670,
aa 671-1194 and aa 1135-2414 were amplified by PCR (EXPAND High Fidelity PCR
System; Boehringer Mannheim) as KpnI-NotI fragments. The resulting fragments were
subcloned into a baculovirus transfer vector having the FLAG-tag sequence (43). The
recombinant viruses were isolated using the BACULOGOLD system (Pharmingen),
according to the manufacturer's protocol, and were infected into Sf9 cells (ATCC
accession number CRL 1711) to express FLAG-p300. Recombinant proteins were
affinity purified with M2 anti-FLAG antibody-immobilized agarose (Kodak-IBI)
according to the manufacturer's protocol.

For bacterial expression, cDNAs encoding the p300 portions and the CBP
portion (aa 1174-1850) were first subcloned into the baculovirus transfer vector having
the FLAG-tag as described above. Thereafter, the XhoI and NotI fragments encoding
FLAG-p300 or FLAG-CBP fusions were resubcloned into the E. coli expression vector
pET-28c (Novagen) digested with SalI and NotI. Recombinant proteins were
expressed in E. coli BL21(DE3) and affinity purified with M2-antibody agarose.

Histone acetyltransferases that associate with E1A

Although the adenovirus E1A 12S protein (E1A) inhibits transcription in a
variety of genes via direct binding to p300/CBP (45), E1A also stimulates transcription
in some contexts (46). Thus, p300/CBP-bound E1A was tested to determine whether it
might recruit histone acetyltransferases or deacetylases to regulate transcription. In
addition, experiments were conducted as described below to determine if p300/CBP per
se is a histone acetyltransferase.

Initially, recombinant FLAG-epitope tagged E1A was immobilized on
anti-FLAG antibody beads. Immobilized E1A was incubated with a HeLa nuclear
extract for affinity purification of E1A-associated polypeptides. FLAG-E1A
was then eluted from the beads, along with E1A-associated polypeptides, by
incubating with FLAG-peptide. Although E1A per se has no histone acetyltransferase
activity, E1A recruited significant amounts of histone acetyltransferase activity from the
nuclear extract. It is very unlikely that this activity is derived from P/CAF given that
E1A and P/CAF cannot bind to p300/CBP simultaneously (43). Consistent with this, no
P/CAF was detected in these fractions by immunoblotting.

The E1A N-terminus, a region that is not highly conserved among the various
adenovirus serotypes, is involved in p300/CBP binding in vivo. Mutations in the
N-terminal region lead to loss of the ability for p300/CBP binding without affecting RB
binding (1,47). Thus, the requirement of the E1A N-terminal region for the recruitment
of histone acetyltransferase activity was tested. In contrast to the wild type, the
N-terminal deleted form of E1A (ΔN-E1A) recruited only a background level of
acetyltransferase activity. In agreement with previous reports (47), the ΔN-E1A
showed no ability to interact with p300/CBP, although it still retained the ability to
interact with a variety of other polypeptides, including RB.

To define the relationship between p300/CBP and histone acetyltransferase
activity, affinity purified E1A-binding polypeptides were separated on a Mono Q
ion-exchange column. Both p300/CBP and the acetyltransferase activity coeluted
at 140 mM KCl, while most of the polypeptides eluted at 260 mM KCl. The active
fraction from the Mono Q column (~140 mM KCl) was further separated on a Superdex-200 gel
filtration column. Both p300/CBP and the acetyltransferase activity coeluted after the
void volume, indicating that p300/CBP is involved in the histone acetyltransferase
activity.

p300 is a histone acetyltransferase

The data provided herein indicate that p300 per se, or a polypeptide(s)
associated with p300, possesses histone acetyltransferase activity. To test the former
possibility, the acetyltransferase activity of recombinant p300 was measured. p300 was
divided into three fragments, each of which was expressed in and purified from Sf9 cells
via a baculovirus expression vector. Histone acetyltransferase activity was readily
detected in the C-terminal fragment containing amino acids 1135-2414, whereas no
activity was found in the other fragments, demonstrating conclusively that p300 per se is
a histone acetyltransferase.



p300/CBP-histone acetyltransferase domain

To map the histone acetyltransferase domain of p300, a series of deletions
was prepared. Given the poor conservation of the glutamine-rich region (aa 1815-2414)
in the C. elegans p300/CBP homolog (6), the p300 fragment encoding aa 1135-1810
was expressed in and purified from E. coli. Importantly, this candidate region of p300
(aa 1135-1810) showed significant histone acetyltransferase activity. For further
mapping within this region, a series of N-terminal deletions was constructed. Deletion
of 60 residues, resulting in a fragment containing aa 1195-1810, had no effect on the
acetyltransferase activity, whereas the deletion of 185 residues, yielding a fragment
comprising aa residues 1320-1810, completely eliminated the acetyltransferase activity.

Next, a series of C-terminal deletions was analyzed to determine the requirement
of the P/CAF (or E1A)-binding domain. The p300 fragments lacking the E1A binding
domain (aa 1195-1760, 1195-1706 and 1195-1673) still retained the acetyltransferase
activity, whereas the further truncated mutant (aa 1195-1652) completely lost the
acetyltransferase activity. Consistent with these results, the internal deletion of residues
1418-1720 showed no acetyltransferase activity. These data demonstrate that the
histone acetyltransferase domain is located between the bromodomain and the
E1A-binding domain. Given that the histone acetyltransferase domain is highly
conserved between p300 and CBP (91% similarity), the corresponding region of CBP,
aa residues 1174-1850, was expressed to confirm the acetyltransferase activity. As
expected, comparable activity was detected, indicating that both p300 and CBP are
histone acetyltransferases.
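
The logic of the deletion mapping can be summarized computationally. The sketch below is an illustration only: it takes the fragment boundaries and activities reported above, computes the region shared by all active fragments, and checks that each inactive construct lacks part of that region. The function names are hypothetical, and treating the shared region as the minimal active domain is an interpretive assumption.

```python
# Illustrative only: residue ranges (p300 amino acid numbers) and activities
# are those reported in the text; function names are assumptions.
ACTIVE = [(1135, 1810), (1195, 1810), (1195, 1760), (1195, 1706), (1195, 1673)]
INACTIVE = [(1320, 1810), (1195, 1652)]
INTERNAL_DELETION = (1418, 1720)  # residues removed; the construct was inactive


def common_region(fragments):
    """Return the residue range shared by every fragment, or None."""
    start = max(s for s, _ in fragments)
    end = min(e for _, e in fragments)
    return (start, end) if start <= end else None


def contains(fragment, region):
    """True if fragment spans the whole region."""
    return fragment[0] <= region[0] and fragment[1] >= region[1]


def overlaps(a, b):
    """True if the two residue ranges share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


minimal = common_region(ACTIVE)

# Every inactive construct should be missing part of the shared region.
consistent = (
    all(not contains(frag, minimal) for frag in INACTIVE)
    and overlaps(INTERNAL_DELETION, minimal)
)

print("region shared by all active fragments:", minimal)
print("inactive constructs consistent with this region:", consistent)
```

Read this way, the data place the N-terminal boundary of the active region between residues 1195 and 1320 and the C-terminal boundary between residues 1652 and 1673.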

Among various acetyltransferases including histone acetyltransferases GCN5 and
P/CAF, putative acetyl-CoA binding sites are conserved (48). However, multiple
alignment analysis (49) showed that the p300/CBP histone acetyltransferase domain
does not belong to this group. Moreover, comparison of the p300/CBP histone
acetyltransferase domain with peptide sequence databases (23) showed no sequence
similarity to any other proteins. Accordingly, this invention shows that p300/CBP
represents a novel class of acetyltransferases in that it does not have the conserved motif
found among previously described acetyltransferases (48).

p300 acetylates all core histones in mononucleosomes

Substrate specificity for acetylation by p300 was also examined. As substrates,
histone octamers and mononucleosomes (146 base pairs of DNA wrapped around the
octamer of core histones) were used. Given that the histone octamer dissociates into
dimers or tetramers under physiological conditions, the histone octamer is referred to
here as core histones. When core histones were used, p300 acetylated all four proteins,
but preferentially H3 and H4. More importantly, in a nucleosomal context, p300
acetylated all four core histones nearly stoichiometrically. In contrast, p300 acetylated
neither BSA nor lysozyme.

Hyperacetylated histones are believed to be linked with transcriptionally active
chromatin (26,27,50,51). Hyperacetylated forms are found in histones H4, H3 and H2B,
which have multiple acetylation sites in vivo. Thus, the level of acetylation by p300 was
also tested.

Mononucleosomes treated with p300 were analyzed by two-dimensional gel
electrophoresis. A Coomassie blue-stained gel and the corresponding autoradiogram
showed that a significant amount of histones, especially H4, were hyperacetylated.
Importantly, acetylation levels by p300 were very close to those of hyperacetylated
histones prepared from HeLa nuclei treated with sodium butyrate, a histone deacetylase
inhibitor. In contrast, no acetylated forms were detected in the reaction
without p300. These results indicate that p300 acetylates histones in mononucleosomes
to the hyperacetylated state by targeting multiple lysine residues.

p300 acetylates the four lysines in the histone H4 N-terminal tail in vitro which are
acetylated in vivo

Lysines at positions 5, 8, 12 and 16 of histone H4 are acetylated in vivo
(51). Recent studies with yeast histone acetyltransferases demonstrate the
position-specific acetylation by distinct acetyltransferases, i.e., while cytoplasmic
acetyltransferases for histone deposition and chromatin assembly modify
positions 5 and 12, GCN5 modifies positions 8 and 16 (52). Accordingly, the positions
of acetylation by p300 were also determined. A series of synthetic peptides containing
acetylated lysines at various positions was used to determine the acetylation
site-specificity of p300. Consistent with the two-dimensional gel electrophoresis
analysis, the experiments with peptide substrates showed that p300 acetylates all four
lysines in the histone H4 that are acetylated in vivo. These results are consistent with the
view that deposition-related diacetylated histones are deacetylated during maturation
of chromatin (53).
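
The site specificities described above can be laid out as a small lookup. This is a minimal sketch that encodes only the positions stated in the text; the dictionary keys and the helper name are hypothetical labels chosen for illustration.

```python
# Illustrative only: lysine positions are those stated in the text; the labels
# and the helper name are assumptions made for this sketch.
IN_VIVO_H4_SITES = {5, 8, 12, 16}

SITES_BY_ENZYME = {
    "cytoplasmic deposition-related acetyltransferases": {5, 12},
    "GCN5": {8, 16},
    "p300": {5, 8, 12, 16},
}


def missing_sites(enzyme):
    """Return the in vivo H4 sites that the named enzyme does not modify."""
    return sorted(IN_VIVO_H4_SITES - SITES_BY_ENZYME[enzyme])


for enzyme in SITES_BY_ENZYME:
    print(f"{enzyme}: sites not covered = {missing_sites(enzyme)}")
```

Of the three entries, only p300 covers all four positions that are acetylated in vivo.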

p300 preferentially acetylates the N-terminal histone tail

Histone acetyltransferases modify specific lysine residues in the N-terminal
tail of core histones but not the C-terminal globular domain in vivo (26,27,50,51).
Structural models of nucleosomes (54,55,56) suggest that most of the lysine residues in
the C-terminal globular domain are buried. Therefore, experiments were conducted to
examine whether restricted acetylation of the N-terminal tail resulted from the substrate
specificity of the enzyme or inaccessibility of the enzyme to the core domain in
nucleosomes. The globular domains of all core histones contain a long helix flanked on
either side by a loop segment and short helix, termed the "histone fold" (54,55,56).
The histone fold is involved in formation of the stable H2A-H2B and H3-H4
hetero-dimers, consisting of extensive hydrophobic contacts between the paired
molecules. Therefore, it is likely that a histone monomer cannot fold properly, thereby
increasing access of the histone acetyltransferase to the core domain. Based on these
considerations, experiments were conducted to determine whether p300 acetylates free
histone H4 in an N-terminal-specific manner.

Histone H4 was acetylated with p300 and subsequently the histone tail was
removed by partial digestion with trypsin. The distributions of radioactivity between
intact and core histones were compared. While the globular core histone domain was
predominant at the higher trypsin concentrations, radioactivity was detected mostly in
the intact histone. These data demonstrate that p300 preferentially acetylates the
N-terminal tail of histone H4.

III. P/CAF interaction with MyoD

Tissue culture and transfection experiments 

C2C12 mouse cells (ATCC accession number CRL 1772) were grown in
Dulbecco's modified Eagle medium (DMEM) supplemented with 20% fetal bovine
serum (FBS) until they reached confluence. Differentiation was induced by switching
medium to differentiation medium (DM), consisting of DMEM containing 2% horse
serum. C3H10T1/2 fibroblasts (ATCC accession number CCL 226) were grown in
DMEM supplemented with 10% FBS. Cells were transfected by the calcium phosphate
precipitation method. Total amounts of transfected DNA were equalized by empty
vector DNA. After 12 h incubation in medium containing the precipitated DNA, the
cells were washed and incubated in fresh DMEM containing 10% FBS for an additional
24 h. Afterwards, differentiation was induced by incubating in DM for 36 to 72 h.
Chloramphenicol acetyltransferase (CAT) assays were performed as previously
described (64,69). The quantities of cell extracts used for CAT assays were normalized
to β-galactosidase activity by cotransfection of 1 mg of the β-galactosidase expression
vector, pON260.

Expression vectors used for transfection experiments are as follows:
pCX-P/CAF for P/CAF (43); pCMV-bp300 for p300 (65), pCMV-p300 (1869-2414)
(64) and pCMV-p300 (1514-1922) (60) for p300 wild type and mutants; pE1A12S,
pE1A12S R2G, pE1A12S Δ2-36 and pE1A12S Δ121-130 for E1A wild type and
mutants (66,67,68); and pEMSV-MyoD for MyoD (64).



The antisense P/CAF RNA expression vector, pcDNA3 P/CAF-AS, was created
as follows. The 2.5 kb EcoRI-KpnI fragment containing the entire P/CAF open reading
frame was isolated from pCX-P/CAF (43). This fragment was subcloned into the
EcoRI-KpnI sites of plasmid pcDNA3 (Invitrogen) so that the antisense P/CAF RNA is
driven under the CMV promoter. Reporter genes employed were 4RE-CAT and
MCK-CAT (69). 4RE-CAT is driven by a synthetic promoter containing 4 copies of the
E-box, whereas MCK-CAT is driven by the native MCK promoter (nucleotides -1256 to
+7).

Microinjection and immunofluorescence 

Cells were grown on small glass slides, subdivided into numbered squares of 2
mm x 2 mm and microinjected with purified and concentrated antibodies, as previously
described (70). For immunofluorescence, cells were fixed in either 2%
paraformaldehyde or 1:2 methanol/acetone solution, preincubated with 5% BSA/PBS
and incubated with the primary antibodies for 30 min at 37°C. Subsequently, antibody
was visualized by incubating with either rhodamine- or fluorescein-conjugated secondary
antibody for 30 min at 37°C. Injected antibodies were stained with a
rhodamine-conjugated secondary antibody and nuclei were counter-stained by DAPI as
previously described (69).

Antibodies employed are as follows: rabbit polyclonal affinity purified
anti-P/CAF antibody (43), rabbit polyclonal anti-p300/CBP antiserum (71), mouse
monoclonal anti-MyoD antibody (clone 5.8A, kindly provided by P. Houghton), goat
polyclonal anti-c-Jun affinity purified antibody (Santa Cruz) and rabbit pre-immune
serum.




Immunoprecipitation and DNA affinity purification 

Cells were resuspended in lysis buffer (20 mM NaPO4, 150 mM NaCl, 5 mM
MgCl2, 0.1% NP40, 1 mM DTT, 10 mM sodium fluoride, 0.1 mM sodium vanadate, 1
mM phenylmethylsulfonyl-fluoride and 10 mg/ml each of leupeptin, aprotinin and
pepstatin). After 30 min incubation on ice, samples were centrifuged at 12,000 x g for
30 min and supernatants were used as cell extracts. Extracts were pre-cleared by
incubating with rabbit pre-immune serum and protein A/G Plus-Agarose (Santa Cruz)
for 2 h at 4°C. For immunoprecipitation, the supernatants were incubated with the
respective antibodies for 3 h at 4°C. Protein A/G Plus-Agarose was added, and
incubation continued for 3 h. The matrix was washed with lysis buffer, then boiled in
2 x SDS sample buffer. Immunoblotting was performed by using the ECL
chemiluminescent detection kit (Amersham) according to the manufacturer's protocol.

Affinity purification of E-box-bound complexes was done as previously
described (69). Briefly, 100 ng of the biotinylated double stranded DNA containing the
E-box were immobilized on streptavidin-conjugated magnetic beads and incubated with
500 mg of cell extracts in the presence of poly dI-dC. After extensive washing, bound
proteins were eluted with SDS sample buffer and analyzed by immunoblotting.

In vitro protein-protein interaction assays 

The CBP-B fragment and its deletion derivatives were expressed as
GST-fusions as described previously (43). MyoD and E1A (43) were expressed as
FLAG-fusion proteins in Sf9 cells via a baculovirus expression system and
affinity-purified on M2 anti-FLAG antibody-agarose (Kodak-IBI). Crude E. coli
extracts containing GST-fusions were incubated with various amounts of MyoD and/or
E1A in 50 ml of buffer B (20 mM Tris-HCl, pH 8.0, 0.1 M KCl, 5 mM MgCl2, 10%
glycerol, and 0.1% Nonidet P-40) on ice for 10 min. GST-precipitation was performed
as described (43). MyoD and E1A were detected by immunoblotting with anti-FLAG
M2 antibody. For the interaction between P/CAF and MyoD, 1.5 pmol of
FLAG-P/CAF and 15 pmol of FLAG-MyoD were incubated in 50 ml of buffer B on ice
for 10 min. The mixture was further incubated with 2 mg of anti-P/CAF (43) or
anti-hADA2 antibody for 60 min. The immunocomplexes were precipitated by
incubation with [volume illegible] of protein A-Trisacryl (Pierce) and rotated for 1-4 hr at 4°C. The
matrix was washed 4 times with 200 ml of buffer B and boiled in 10 ml of 2 x SDS
sample buffer. The proteins were resolved on a 4%-20% gradient SDS-PAGE and
subjected to immunoblotting with the anti-FLAG M2 antibody. The blot was developed
with the SUPERSIGNAL chemiluminescent substrates (Pierce).






P/CAF coactivates muscle-specific transcription 

P/CAF and MyoD were co-transfected into mouse C3H10T1/2 fibroblasts, and
MyoD-mediated transcription was determined from reporter activity driven by the
artificial (4RE) and the naturally-occurring muscle creatine kinase (MCK) promoters.
Overexpression of P/CAF stimulated MyoD-dependent transcription several-fold at
both promoters. Similar results were obtained for the MyoD-activated myogenin
promoter. Transcriptional activation was further stimulated by co-transfecting with
MyoD, P/CAF and p300 expression vectors, suggesting that P/CAF may function by
forming a complex with p300/CBP. Consistent with the lack of DNA binding capacity in
P/CAF, overexpression of P/CAF alone did not increase the basal transcriptional activity
of either enhancer. To test whether P/CAF and p300/CBP function in the same pathway,
two dominant negative forms of p300 were employed which specifically inhibit
p300/CBP-mediated transcription (60,64). The p300 segment spanning residues
1514-1922 inhibits the MyoD-dependent activation via direct interaction with MyoD
(60), whereas the p300 segment spanning residues 1869-2414 inhibits it without direct
interaction (64). Both dominant negative mutants inhibited MyoD-coactivation by
P/CAF, suggesting that P/CAF and p300/CBP function in the same pathway.

For further elucidation of the activation mechanism by P/CAF, the effect of E1A,
which inhibits MyoD-dependent transcription and differentiation (65,72,73) via direct
interaction with p300/CBP (65,78), was tested. Expression of E1A in C3H10T1/2
fibroblasts inhibited stimulation of MyoD-directed transcription by P/CAF
overexpression. E1A mutants lacking p300/CBP-binding activity, E1A Δ2-36 and E1A
R2G (67,79), had almost no effect. On the other hand, an E1A mutant retaining
p300/CBP-binding activity, E1A Δ121-130, behaved like the wild type. Since E1A
associates with p300/CBP, but not with P/CAF, these results suggest that P/CAF
functions in MyoD-directed transcription via interaction with p300/CBP.

To address the role of P/CAF as a myogenic coactivator in a more relevant
environment, P/CAF was overexpressed in proliferating C2C12 myoblasts which express
endogenous myogenic bHLH factors. As observed in fibroblasts, overexpression of
P/CAF stimulated muscle specific transcription. Concomitant expression of exogenous
p300 increased P/CAF-mediated coactivation. The repression exerted by wild type E1A,
but not mutant E1A Δ2-36, on P/CAF coactivation of MyoD was also observed in
muscle cells.

Similar experiments were performed with myogenic cell lines that were stably
transformed with wild type or mutant E1A-expressing vectors (66). Coactivation by
P/CAF was inhibited by wild type E1A or the E1A mutant that retains
p300/CBP-binding activity (E1A Δ121-130). In contrast, E1A mutants that lack
p300/CBP-binding (E1A Δ2-36 and E1A R2G) allowed transcriptional coactivation by
P/CAF. Taken together, these experiments show that P/CAF coactivates MyoD-directed
transcription via interaction with p300/CBP.

P/CAF stimulates myogenic differentiation

Given that P/CAF potentiates MyoD-directed transcription, the ability of P/CAF
to assist MyoD in promoting myogenic differentiation was investigated. To this end,
C3H10T1/2 fibroblasts were transiently transfected with P/CAF and MyoD expression
vectors. An expression vector for the green fluorescent protein (GFP) was
co-transfected to identify transfected cells. After incubation in differentiation medium,
the myogenic conversion of transfected cells was determined by simultaneous expression
of the GFP and the differentiation-specific marker myosin heavy chain (MHC). Forced
expression of MyoD in fibroblasts caused muscle differentiation in 12% of the
transfected fibroblasts. This myogenic conversion rose to 20% upon co-expression of MyoD and
P/CAF. As observed in transcription experiments, stimulation of differentiation by
P/CAF was counteracted by co-transfection with the p300 dominant negative mutant,
p300 (1869-2414). Consistent with a general role for coactivators, overexpression of
P/CAF alone was unable to differentiate fibroblasts.

Similar experiments were done using proliferating C2C12 myoblasts in which the
differentiation program is already committed. Most of the myoblasts differentiated into
myotubes by overexpressing P/CAF, whereas only a modest effect was observed by
overexpressing p300. In contrast, differentiation was inhibited slightly by overexpressing
c-Jun. This inhibitory effect presumably was caused by titration of p300/CBP, which
associates directly with c-Jun (74). A similar inhibition was observed in the p300
dominant negative mutant. Consistent with the transcriptional effect, E1A almost
completely inhibited differentiation. The E1A mutant R2G, lacking p300/CBP-binding
capability but retaining the retinoblastoma protein (Rb)-binding capability, only partially
inhibited differentiation, although this same mutant
inhibited transcription as severely as the wild type. Taken together, these data show that
P/CAF stimulates muscle differentiation by coactivating MyoD function via p300/CBP
association.

P/CAF is essential for myogenic transcription and differentiation 

To test the necessity of P/CAF for myogenic transcription, experiments were
conducted whereby P/CAF synthesis was inhibited by expressing antisense P/CAF RNA.
A vector from which the P/CAF mRNA is transcribed in the antisense orientation
(P/CAF-AS) was transfected with P/CAF and MyoD expression vectors into fibroblasts
and MyoD-dependent transcription was examined. Cotransfection of the antisense
expression vector strongly inhibited MyoD-dependent transcription below the level of
induction elicited by MyoD alone, demonstrating that expression of P/CAF antisense
RNA inhibits not only the coactivation exerted by exogenous P/CAF but also that of
endogenous P/CAF. These results indicate that P/CAF is essential for MyoD-dependent
transcription.

Studies were also carried out to determine whether expression of P/CAF
antisense RNA inhibits myogenic differentiation. C3H10T1/2 fibroblasts were transiently
transfected with various expression vectors with or without the P/CAF antisense RNA
expression vector. Expression of P/CAF antisense RNA reduced MyoD-mediated
myogenic conversion of fibroblasts. Expression of P/CAF antisense RNA also
counteracted the stimulatory effect of both P/CAF and p300 on myogenic
differentiation. These data support the view that P/CAF and p300/CBP coactivate
MyoD-dependent transcription in the same pathway. More drastic inhibition was
observed in C2C12 myoblasts in similar experiments. Therefore, it can be concluded that
P/CAF is essential for transcription of muscle-specific genes and hence differentiation
into myotubes.

To further confirm the essential role of P/CAF for myogenic differentiation,
blockage experiments by antibody microinjection were performed. Antibodies were
injected into the cytoplasm of proliferating C2C12 myoblasts to prevent the nuclear
transport of newly synthesized target proteins. After incubation in the differentiation
medium, the degree of differentiation was determined. Microinjection of an anti-P/CAF
antibody almost completely inhibited differentiation. Similar results were obtained by
microinjecting anti-p300/CBP antibodies. Although microinjection of either
anti-p300/CBP or anti-P/CAF antibody was sufficient to inhibit differentiation, an even
greater inhibition was observed by coinjecting both of them. Microinjection of
anti-P/CAF or anti-p300/CBP antibody did not interfere with induction of p53 by DNA
damaging agents, showing specificity of the inhibition by the antibodies. In contrast to
anti-P/CAF or anti-p300/CBP antibodies, the injection of anti-MyoD antibody only
partially inhibited differentiation, supporting the view of functional redundancy between
MyoD and Myf-5 (75,76). Injection of anti-c-Jun antibody or control antibody did not
interfere with muscle differentiation.

Similar experiments were performed with C3H10T1/2 fibroblasts stably
expressing MyoD. In these cells, either anti-p300/CBP or anti-P/CAF antibody
completely inhibited muscle differentiation. In contrast to myoblasts, anti-MyoD
antibody completely blocked differentiation in the fibroblasts expressing MyoD.
Anti-c-Jun and control antibodies did not interfere with differentiation. Taken together,
these results demonstrate that P/CAF and p300/CBP are indispensable for activation of
the myogenic program.



p300/CBP, P/CAF and MyoD form a multimeric complex in vivo

The data described above indicate that P/CAF stimulates MyoD-directed
transcription via association with p300/CBP. Thus, experiments were conducted to
investigate whether P/CAF, p300/CBP and MyoD could associate in a complex.
First, cellular extracts derived from C2C12 myotubes were subjected to
immunoprecipitation. Both anti-MyoD and anti-p300/CBP antibodies co-precipitated
P/CAF. In a complementary experiment, both anti-p300/CBP and anti-P/CAF
antibodies also co-precipitated MyoD, suggesting that these factors form a multimeric
protein complex in myotubes.

Next, attempts were made to detect this complex on the E-box, the DNA
binding site for MyoD. Immobilized DNA containing an E-box sequence was incubated
with myotube extracts. After extensive washing, P/CAF, p300/CBP and MyoD were
analyzed by immunoblotting. P/CAF, p300/CBP and MyoD were all affinity purified on
the immobilized DNA, whereas they were not purified on the control DNA lacking the
E-box. Given that P/CAF and p300/CBP per se cannot bind to DNA, these observations
indicate that P/CAF and p300/CBP are recruited through MyoD at the E-box sites to
form a multi-protein complex.

Complex formation is inhibited by viral transforming factors

Since the oncoviral proteins E1A and large T antigen inhibit myogenic
transcription and differentiation, the effect of these factors on the formation of
complexes on the E-box was tested. Importantly, very small amounts of P/CAF and
p300/CBP were co-purified on the E-box from extracts of myocytes that stably express
E1A or large T antigen, although MyoD was detected under these conditions. The lower
recovery of MyoD from E1A-expressing muscle cells could reflect the low level of
MyoD in the extracts (66). These results indicate that E1A and large T antigen
dissociate P/CAF and p300/CBP from MyoD without altering MyoD binding to DNA.

Consistent with the previous observations that transiently expressed E1A
prevents interaction between P/CAF and p300/CBP in vivo (43), the association
between p300/CBP and P/CAF was abolished in myoblasts stably transformed by wild
type E1A but not in those clones transformed with the E1A mutant RG2, which is unable to bind
p300/CBP. Similarly, the interaction between p300/CBP and P/CAF was abolished by
large T antigen but not by the mutant protein that localizes into the cytoplasm (77).

Interaction between MyoD, P/CAF and CBP in vitro

Previous interaction experiments in vitro indicate that the CBP region spanning
residues 1801 to 1850 is crucial for interaction with both P/CAF and E1A (43). While
most sequence-specific factors bind to CBP sites distinct from the P/CAF/E1A binding
sites, MyoD interacts with an overlapping CBP fragment called the CH3 region
(60,64,65). To understand how P/CAF, p300/CBP and MyoD associate, the CBP sites
important for MyoD binding were mapped more precisely. Consistent with previous
reports (60,64,65), the CBP fragment spanning residues 1801-2000 (fragment B) bound
MyoD. Moreover, deletion of residues 1801 to 1850 within fragment B completely
abolished interaction with MyoD, which is similar to the results obtained with P/CAF
and E1A. Importantly, an internal deletion of residues 1850-1878 abolished the MyoD
interaction with CBP, while it did not affect binding of E1A or P/CAF (43). These
results suggest that MyoD and P/CAF bind to distinct sites of p300/CBP, although the
binding sites may overlap. Moreover, a direct interaction was observed between MyoD
and P/CAF, which may contribute to stabilization of the multimeric complex.

These data show that E1A prevents p300/CBP interaction not only with
P/CAF but also with MyoD in vivo. To obtain evidence that this
inhibition is due to the direct action of E1A, competition experiments were performed
in vitro. Importantly, the interaction between CBP and MyoD was strongly inhibited by
addition of E1A, indicating that E1A inhibits myogenic transcription by disrupting
multiple interactions.



Although the present process has been described with reference to specific
details of certain embodiments thereof, it is not intended that such details should be
regarded as limitations upon the scope of the invention except as and to the extent that
they are included in the accompanying claims.



Throughout this application various publications are referenced by numbers
within parentheses. Full citations for these publications are as follows. The disclosures
of these publications in their entireties are hereby incorporated by reference into this
application in order to more fully describe the state of the art to which this invention
pertains. 
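
By way of illustration only, and not as part of the disclosed methods, the three-letter residue notation used in the Sequence Listing can be converted to one-letter codes and checked against the LENGTH recited for each sequence. The following Python sketch assumes the listing is available as plain text; the function name `three_to_one` and its error handling are hypothetical choices made for this example.

```python
# Illustrative helper, not part of the disclosure. The name three_to_one and
# the decision to raise on unknown tokens are assumptions of this sketch.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def three_to_one(listing: str) -> str:
    """Convert whitespace-separated three-letter codes to one-letter codes.

    Purely numeric tokens are treated as position markers and skipped.
    Any other unrecognized token raises ValueError, so transcription
    errors (for example "Gin" in place of "Gln") are surfaced, not hidden.
    """
    residues = []
    for token in listing.split():
        if token.isdigit():
            continue  # position marker such as 5, 10, 15
        if token not in THREE_TO_ONE:
            raise ValueError(f"unrecognized residue code: {token!r}")
        residues.append(THREE_TO_ONE[token])
    return "".join(residues)


# First row of SEQ ID NO: 1 together with its position markers.
first_row = """
Met Ser Glu Ala Gly Gly Ala Gly Pro Gly Gly Cys Gly Ala Gly Ala
1               5                   10                  15
"""
print(three_to_one(first_row))  # MSEAGGAGPGGCGAGA
```

Comparing `len(three_to_one(text))` with the recited LENGTH (for example, 832 for SEQ ID NO: 1) gives a quick consistency check on a transcribed listing.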
SEQUENCE LISTING

(1) GENERAL INFORMATION 

(i) APPLICANT: The United States of America, as represented by the
Secretary, Department of Health and Human Services, c/o 
National Institutes of Health, Office of Technology Transfer, 
6011 Executive Boulevard, Suite 325, Rockville, Maryland 20842 

(ii) TITLE OF THE INVENTION: METHODS AND COMPOSITIONS FOR
p300/CBP-ASSOCIATED TRANSCRIPTIONAL CO-FACTOR P/CAF

(iii) NUMBER OF SEQUENCES: 18 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: NEEDLE & ROSENBERG, P.C.

(B) STREET: Suite 1200, 127 Peachtree Street, NE 

(C) CITY: Atlanta 

(D) STATE: GA 

(E) COUNTRY: USA 

(F) ZIP: 30303 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Diskette 

(B) COMPUTER: IBM Compatible 

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: FastSEQ for Windows Version 2.0 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 23-JUL-1997 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: Corresponding U.S. Serial No. 60/022,273 

(B) FILING DATE: 23-July-1996 



(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Miller, Mary L

(B) REGISTRATION NUMBER: 39,303

(C) REFERENCE/DOCKET NUMBER: 14014.0238/P

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 404/688-0770

(B) TELEFAX: 404/688-9880

(C) TELEX:



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 832 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Met Ser Glu Ala Gly Gly Ala Gly Pro Gly Gly Cys Gly Ala Gly Ala
1               5                   10                  15

Gly Ala Gly Ala Gly Pro Gly Ala Leu Pro Pro Gln Pro Ala Ala Leu
            20                  25                  30

Pro Pro Ala Pro Pro Gln Gly Ser Pro Cys Ala Ala Ala Ala Gly Gly
        35                  40                  45

Ser Gly Ala Cys Gly Pro Ala Thr Ala Val Ala Ala Ala Gly Thr Ala
    50                  55                  60

Glu Gly Pro Gly Gly Gly Gly Ser Ala Arg Ile Ala Val Lys Lys Ala
65                  70                  75                  80

Gln Leu Arg Ser Ala Pro Arg Ala Lys Lys Leu Glu Lys Leu Gly Val
                85                  90                  95

Tyr Ser Ala Cys Lys Ala Glu Glu Ser Cys Lys Cys Asn Gly Trp Lys
            100                 105                 110

Asn Pro Asn Pro Ser Pro Thr Pro Pro Arg Ala Asp Leu Gln Gln Ile
        115                 120                 125

Ile Val Ser Leu Thr Glu Ser Cys Arg Ser Cys Ser His Ala Leu Ala
    130                 135                 140

Ala His Val Ser His Leu Glu Asn Val Ser Glu Glu Glu Met Asn Arg
145                 150                 155                 160

Leu Leu Gly Ile Val Leu Asp Val Glu Tyr Leu Phe Thr Cys Val His
                165                 170                 175

Lys Glu Glu Asp Ala Asp Thr Lys Gln Val Tyr Phe Tyr Leu Phe Lys
            180                 185                 190

Leu Leu Arg Lys Ser Ile Leu Gln Arg Gly Lys Pro Val Val Glu Gly
        195                 200                 205

Ser Leu Glu Lys Lys Pro Pro Phe Glu Lys Pro Ser Ile Glu Gln Gly
    210                 215                 220

Val Asn Asn Phe Val Gln Tyr Lys Phe Ser His Leu Pro Ala Lys Glu
225                 230                 235                 240

Arg Gln Thr Ile Val Glu Leu Ala Lys Met Phe Leu Asn Arg Ile Asn
                245                 250                 255

Tyr Trp His Leu Glu Ala Pro Ser Gln Arg Arg Leu Arg Ser Pro Asn
            260                 265                 270

Asp Asp Ile Ser Gly Tyr Lys Glu Asn Tyr Thr Arg Trp Leu Cys Tyr
        275                 280                 285

Cys Asn Val Pro Gln Phe Cys Asp Ser Leu Pro Arg Tyr Glu Thr Thr
    290                 295                 300

Gln Val Phe Gly Arg Thr Leu Leu Arg Ser Val Phe Thr Val Met Arg
305                 310                 315                 320

Arg Gln Leu Leu Glu Gln Ala Arg Gln Glu Lys Asp Lys Leu Pro Leu
                325                 330                 335

Glu Lys Arg Thr Leu Ile Leu Thr His Phe Pro Lys Phe Leu Ser Met
            340                 345                 350

Leu Glu Glu Glu Val Tyr Ser Gln Asn Ser Pro Ile Trp Asp Gln Asp
        355                 360                 365

Phe Leu Ser Ala Ser Ser Arg Thr Ser Gln Leu Gly Ile Gln Thr Val
    370                 375                 380

Ile Asn Pro Pro Pro Val Ala Gly Thr Ile Ser Tyr Asn Ser Thr Ser
385                 390                 395                 400

Ser Ser Leu Glu Gln Pro Asn Ala Gly Ser Ser Ser Pro Ala Cys Lys
                405                 410                 415

Ala Ser Ser Gly Leu Glu Ala Asn Pro Gly Glu Lys Arg Lys Met Thr
            420                 425                 430

Asp Ser His Val Leu Glu Glu Ala Lys Lys Pro Arg Val Met Gly Asp
        435                 440                 445

Ile Pro Met Glu Leu Ile Asn Glu Val Met Ser Thr Ile Thr Asp Pro
    450                 455                 460

Ala Ala Met Leu Gly Pro Glu Thr Asn Phe Leu Ser Ala His Ser Ala
465                 470                 475                 480

Arg Asp Glu Ala Ala Arg Leu Glu Glu Arg Arg Gly Val Ile Glu Phe
                485                 490                 495

His Val Val Gly Asn Ser Leu Asn Gln Lys Pro Asn Lys Lys Ile Leu
            500                 505                 510

Met Trp Leu Val Gly Leu Gln Asn Val Phe Ser His Gln Leu Pro Arg
        515                 520                 525

Met Pro Lys Glu Tyr Ile Thr Arg Leu Val Phe Asp Pro Lys His Lys
    530                 535                 540

Thr Leu Ala Leu Ile Lys Asp Gly Arg Val Ile Gly Gly Ile Cys Phe
545                 550                 555                 560

Arg Met Phe Pro Ser Gln Gly Phe Thr Glu Ile Val Phe Cys Ala Val
                565                 570                 575

Thr Ser Asn Glu Gln Val Lys Gly Tyr Gly Thr His Leu Met Asn His
            580                 585                 590

Leu Lys Glu Tyr His Ile Lys His Asp Ile Leu Asn Phe Leu Thr Tyr
        595                 600                 605

Ala Asp Glu Tyr Ala Ile Gly Tyr Phe Lys Lys Gln Gly Phe Ser Lys
    610                 615                 620

Glu Ile Lys Ile Pro Lys Thr Lys Tyr Val Gly Tyr Ile Lys Asp Tyr
625                 630                 635                 640

Glu Gly Ala Thr Leu Met Gly Cys Glu Leu Asn Pro Arg Ile Pro Tyr
                645                 650                 655

Thr Glu Phe Ser Val Ile Ile Lys Lys Gln Lys Glu Ile Ile Lys Lys
            660                 665                 670

Leu Ile Glu Arg Lys Gln Ala Gln Ile Arg Lys Val Tyr Pro Gly Leu
        675                 680                 685

Ser Cys Phe Lys Asp Gly Val Arg Gln Ile Pro Ile Glu Ser Ile Pro
    690                 695                 700

Gly Ile Arg Glu Thr Gly Trp Lys Pro Ser Gly Lys Glu Lys Ser Lys
705                 710                 715                 720

Glu Pro Arg Asp Pro Asp Gln Leu Tyr Ser Thr Leu Lys Ser Ile Leu
                725                 730                 735

Gln Gln Val Lys Ser His Gln Ser Ala Trp Pro Phe Met Glu Pro Val
            740                 745                 750

Lys Arg Thr Glu Ala Pro Gly Tyr Tyr Glu Val Ile Arg Ser Pro Met
        755                 760                 765

Asp Leu Lys Thr Met Ser Glu Arg Leu Lys Asn Arg Tyr Tyr Val Ser
    770                 775                 780

Lys Lys Leu Phe Met Ala Asp Leu Gln Arg Val Phe Thr Asn Cys Lys
785                 790                 795                 800

Glu Tyr Asn Ala Pro Glu Ser Glu Tyr Tyr Lys Cys Ala Asn Ile Leu
                805                 810                 815

Glu Lys Phe Phe Phe Ser Lys Ile Lys Glu Ala Gly Leu Ile Asp Lys
            820                 825                 830







(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 481 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Leu Glu Glu Glu Val Tyr Ser Gln Asn Ser Pro Ile Trp Asp Gln
1               5                   10                  15

Asp Phe Leu Ser Ala Ser Ser Arg Thr Ser Gln Leu Gly Ile Gln Thr
            20                  25                  30

Val Ile Asn Pro Pro Pro Val Ala Gly Thr Ile Ser Tyr Asn Ser Thr
        35                  40                  45

Ser Ser Ser Leu Glu Gln Pro Asn Ala Gly Ser Ser Ser Pro Ala Cys
    50                  55                  60

Lys Ala Ser Ser Gly Leu Glu Ala Asn Pro Gly Glu Lys Arg Lys Met
65                  70                  75                  80

Thr Asp Ser His Val Leu Glu Glu Ala Lys Lys Pro Arg Val Met Gly
                85                  90                  95

Asp Ile Pro Met Glu Leu Ile Asn Glu Val Met Ser Thr Ile Thr Asp
            100                 105                 110

Pro Ala Ala Met Leu Gly Pro Glu Thr Asn Phe Leu Ser Ala His Ser
        115                 120                 125

Ala Arg Asp Glu Ala Ala Arg Leu Glu Glu Arg Arg Gly Val Ile Glu
    130                 135                 140

Phe His Val Val Gly Asn Ser Leu Asn Gln Lys Pro Asn Lys Lys Ile
145                 150                 155                 160

Leu Met Trp Leu Val Gly Leu Gln Asn Val Phe Ser His Gln Leu Pro
                165                 170                 175

Arg Met Pro Lys Glu Tyr Ile Thr Arg Leu Val Phe Asp Pro Lys His
            180                 185                 190

Lys Thr Leu Ala Leu Ile Lys Asp Gly Arg Val Ile Gly Gly Ile Cys
        195                 200                 205

Phe Arg Met Phe Pro Ser Gln Gly Phe Thr Glu Ile Val Phe Cys Ala
    210                 215                 220

Val Thr Ser Asn Glu Gln Val Lys Gly Tyr Gly Thr His Leu Met Asn
225                 230                 235                 240

His Leu Lys Glu Tyr His Ile Lys His Asp Ile Leu Asn Phe Leu Thr
                245                 250                 255

Tyr Ala Asp Glu Tyr Ala Ile Gly Tyr Phe Lys Lys Gln Gly Phe Ser
            260                 265                 270

Lys Glu Ile Lys Ile Pro Lys Thr Lys Tyr Val Gly Tyr Ile Lys Asp
        275                 280                 285

Tyr Glu Gly Ala Thr Leu Met Gly Cys Glu Leu Asn Pro Arg Ile Pro
    290                 295                 300

Tyr Thr Glu Phe Ser Val Ile Ile Lys Lys Gln Lys Glu Ile Ile Lys
305                 310                 315                 320

Lys Leu Ile Glu Arg Lys Gln Ala Gln Ile Arg Lys Val Tyr Pro Gly
                325                 330                 335

Leu Ser Cys Phe Lys Asp Gly Val Arg Gln Ile Pro Ile Glu Ser Ile
            340                 345                 350

Pro Gly Ile Arg Glu Thr Gly Trp Lys Pro Ser Gly Lys Glu Lys Ser
        355                 360                 365

Lys Glu Pro Arg Asp Pro Asp Gln Leu Tyr Ser Thr Leu Lys Ser Ile
    370                 375                 380

Leu Gln Gln Val Lys Ser His Gln Ser Ala Trp Pro Phe Met Glu Pro
385                 390                 395                 400

Val Lys Arg Thr Glu Ala Pro Gly Tyr Tyr Glu Val Ile Arg Ser Pro
                405                 410                 415

Met Asp Leu Lys Thr Met Ser Glu Arg Leu Lys Asn Arg Tyr Tyr Val
            420                 425                 430

Ser Lys Lys Leu Phe Met Ala Asp Leu Gln Arg Val Phe Thr Asn Cys
        435                 440                 445

Lys Glu Tyr Asn Ala Pro Glu Ser Glu Tyr Tyr Lys Cys Ala Asn Ile
    450                 455                 460

Leu Glu Lys Phe Phe Phe Ser Lys Ile Lys Glu Ala Gly Leu Ile Asp
465                 470                 475                 480

Lys



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 203 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:





V d X 


Val 


Gin 


His 


Thr 


Ly s 


Glv 


Cys 


Lys 


Ara 


Lvs 


Thr 


Asn 


Giy 


Giy 










5 










10 










15 




Cy s 


tr X 


lie 


Cy s 


Ly s 


Gin 


Leu 


lie 


Ala 


Leu 


Cvs 


Cvs 


Tyr 


His 


Ala 


Lys 
















25 










30 






liX s 


c y s 


Gin 


Glu 


Asn 


Ly s 


Cy s 


P ro 


Val 


Pro 


Phe 


Cvs 


Leu 


Asn 


He 


Lys 














4 0 










45 








Gin 


Tire? 

Liy S 


Leu. 




ox 1 1 


Gin 


Gin 


Leu 


Gin 


His 


Arg 


Leu 


Gin 


Gin 


Ala 


Gin 




o yj 










55 










60 












Leii 


Arg 




Arg 


Met 


Ala 


Ser 


Met 


Arg 


Thr 


Glv 


Val 


Val 


Giy 


Gin 












70 










75 










80 


Gln Gln Gly Leu Pro Ser Pro Thr Pro Ala Thr Pro Thr Thr Pro Thr
                85                  90                  95

Gly Gln Gln Pro Thr Thr Pro Gln Thr Pro Gln Pro Thr Ser Gln Pro
            100                 105                 110

Gln Pro Thr Pro Pro Asn Ser Met Pro Pro Tyr Leu Pro Arg Thr Gln
        115                 120                 125

Ala Ala Gly Pro Val Ser Gln Gly Lys Ala Ala Gly Gln Val Thr Pro
    130                 135                 140

Pro Thr Pro Pro Gln Thr Ala Gln Pro Pro Leu Pro Gly Pro Pro Pro
145                 150                 155                 160

Thr Ala Val Glu Met Ala Met Gln Ile Gln Arg Ala Ala Glu Thr Gln
                165                 170                 175

Arg Gln Met Ala His Val Gln Ile Phe Gln Arg Pro Ile Gln His Gln
            180                 185                 190

Met Pro Pro Met Thr Pro Met Ala Pro Met Gly
        195                 200

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 351 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:












Met Ser Glu Ala Gly Gly Ala Gly Pro Gly Gly Cys Gly Ala Gly Ala
1               5                   10                  15

Gly Ala Gly Ala Gly Pro Gly Ala Leu Pro Pro Gln Pro Ala Ala Leu
            20                  25                  30

Pro Pro Ala Pro Pro Gln Gly Ser Pro Cys Ala Ala Ala Ala Gly Gly
        35                  40                  45

Ser Gly Ala Cys Gly Pro Ala Thr Ala Val Ala Ala Ala Gly Thr Ala
    50                  55                  60

Glu Gly Pro Gly Gly Gly Gly Ser Ala Arg Ile Ala Val Lys Lys Ala
65                  70                  75                  80

Gln Leu Arg Ser Ala Pro Arg Ala Lys Lys Leu Glu Lys Leu Gly Val
                85                  90                  95

Tyr Ser Ala Cys Lys Ala Glu Glu Ser Cys Lys Cys Asn Gly Trp Lys
            100                 105                 110

Asn Pro Asn Pro Ser Pro Thr Pro Pro Arg Ala Asp Leu Gln Gln Ile
        115                 120                 125

Ile Val Ser Leu Thr Glu Ser Cys Arg Ser Cys Ser His Ala Leu Ala
    130                 135                 140

Ala His Val Ser His Leu Glu Asn Val Ser Glu Glu Glu Met Asn Arg
145                 150                 155                 160

Leu Leu Gly Ile Val Leu Asp Val Glu Tyr Leu Phe Thr Cys Val His
                165                 170                 175

Lys Glu Glu Asp Ala Asp Thr Lys Gln Val Tyr Phe Tyr Leu Phe Lys
            180                 185                 190

Leu Leu Arg Lys Ser Ile Leu Gln Arg Gly Lys Pro Val Val Glu Gly
        195                 200                 205

Ser Leu Glu Lys Lys Pro Pro Phe Glu Lys Pro Ser Ile Glu Gln Gly
    210                 215                 220

Val Asn Asn Phe Val Gln Tyr Lys Phe Ser His Leu Pro Ala Lys Glu
225                 230                 235                 240

Arg Gln Thr Ile Val Glu Leu Ala Lys Met Phe Leu Asn Arg Ile Asn
                245                 250                 255

Tyr Trp His Leu Glu Ala Pro Ser Gln Arg Arg Leu Arg Ser Pro Asn
            260                 265                 270

Asp Asp Ile Ser Gly Tyr Lys Glu Asn Tyr Thr Arg Trp Leu Cys Tyr
        275                 280                 285

Cys Asn Val Pro Gln Phe Cys Asp Ser Leu Pro Arg Tyr Glu Thr Thr
    290                 295                 300

Gln Val Phe Gly Arg Thr Leu Leu Arg Ser Val Phe Thr Val Met Arg
305                 310                 315                 320

Arg Gln Leu Leu Glu Gln Ala Arg Gln Glu Lys Asp Lys Leu Pro Leu
                325                 330                 335

Glu Lys Arg Thr Leu Ile Leu Thr His Phe Pro Lys Phe Leu Ser
            340                 345                 350



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 476 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:



Met Leu Glu Glu Glu Ile Tyr Gly Ala Asn Ser Pro Ile Trp Glu Ser
1               5                   10                  15

Gly Phe Thr Met Pro Pro Ser Glu Gly Thr Gln Leu Val Pro Arg Pro
            20                  25                  30

Ala Ser Val Ser Ala Ala Val Val Pro Ser Thr Pro Ile Phe Ser Pro
        35                  40                  45

Ser Met Gly Gly Gly Ser Asn Ser Ser Leu Ser Leu Asp Ser Ala Gly
    50                  55                  60

Ala Glu Pro Met Pro Gly Glu Lys Arg Thr Leu Pro Glu Asn Leu Thr
65                  70                  75                  80

Leu Glu Asp Ala Lys Arg Leu Arg Val Met Gly Asp Ile Pro Met Glu
                85                  90                  95

Leu Val Asn Glu Val Met Leu Thr Ile Thr Asp Pro Ala Ala Met Leu
            100                 105                 110

Gly Pro Glu Thr Ser Leu Leu Ser Ala Asn Ala Ala Arg Asp Glu Thr
        115                 120                 125

Ala Arg Leu Glu Glu Arg Arg Gly Ile Ile Glu Phe His Val Ile Gly
    130                 135                 140

Asn Ser Leu Thr Pro Lys Ala Asn Arg Arg Val Leu Leu Trp Leu Val
145                 150                 155                 160

Gly Leu Gln Asn Val Phe Ser His Gln Leu Pro Arg Met Pro Lys Glu
                165                 170                 175




Tyr Ile Ala Arg Leu Val Phe Asp Pro Lys His Lys Thr Leu Ala Leu
            180                 185                 190

Ile Lys Asp Gly Arg Val Ile Gly Gly Ile Cys Phe Arg Met Phe Pro
        195                 200                 205

Thr Gln Gly Phe Thr Glu Ile Val Phe Cys Ala Val Thr Ser Asn Glu
    210                 215                 220

Gln Val Lys Gly Tyr Gly Thr His Leu Met Asn His Leu Lys Glu Tyr
225                 230                 235                 240

His Ile Lys His Asn Ile Leu Tyr Phe Leu Thr Tyr Ala Asp Glu Tyr
                245                 250                 255

Ala Ile Gly Tyr Phe Lys Lys Gln Gly Phe Ser Lys Asp Ile Lys Val
            260                 265                 270






Pro Lys Ser Arg Tyr Leu Gly Tyr Ile Lys Asp Tyr Glu Gly Ala Thr
        275                 280                 285

Leu Met Glu Cys Glu Leu Asn Pro Arg Ile Pro Tyr Thr Glu Leu Ser
    290                 295                 300

His Ile Ile Lys Lys Gln Lys Glu Ile Ile Lys Lys Leu Ile Glu Arg
305                 310                 315                 320

Lys Gln Ala Gln Ile Arg Lys Val Tyr Pro Gly Leu Ser Cys Phe Lys
                325                 330                 335

Glu Gly Val Arg Gln Ile Pro Val Glu Ser Val Pro Gly Ile Arg Glu
            340                 345                 350

Thr Gly Trp Lys Pro Leu Gly Lys Glu Lys Gly Lys Glu Leu Lys Asp
        355                 360                 365

Pro Asp Gln Leu Tyr Thr Thr Leu Lys Asn Leu Leu Ala Gln Ile Lys
    370                 375                 380

Ser His Pro Ser Ala Trp Pro Phe Met Glu Pro Val Lys Lys Ser Glu
385                 390                 395                 400

Ala Pro Asp Tyr Tyr Glu Val Ile Arg Phe Pro Ile Asp Leu Lys Thr
                405                 410                 415

Met Thr Glu Arg Leu Arg Ser Arg Tyr Tyr Val Thr Arg Lys Leu Phe
            420                 425                 430

Val Ala Asp Leu Gln Arg Val Ile Ala Asn Cys Arg Glu Tyr Asn Pro
        435                 440                 445

Pro Asp Ser Glu Tyr Cys Arg Cys Ala Ser Ala Leu Glu Lys Phe Phe
    450                 455                 460

Tyr Phe Lys Leu Lys Glu Gly Gly Leu Ile Asp Lys
465                 470                 475













(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2414 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:



Met Ala Glu Asn Val Val Glu Pro Gly Pro Pro Ser Ala Lys Arg Pro
1               5                   10                  15

Lys Leu Ser Ser Pro Ala Leu Ser Ala Ser Ala Ser Asp Gly Thr Asp
            20                  25                  30

Phe Gly Ser Leu Phe Asp Leu Glu His Asp Leu Pro Asp Glu Leu Ile
        35                  40                  45

Asn Ser Thr Glu Leu Gly Leu Thr Asn Gly Gly Asp Ile Asn Gln Leu
    50                  55                  60

Gln Thr Ser Leu Gly Met Val Gln Asp Ala Ala Ser Lys His Lys Gln
65                  70                  75                  80

Leu Ser Glu Leu Leu Arg Ser Gly Ser Ser Pro Asn Leu Asn Met Gly
                85                  90                  95

Val Gly Gly Pro Gly Gln Val Met Ala Ser Gln Ala Gln Gln Ser Ser
            100                 105                 110

Pro


Gly 


Leu 


Gly 


Leu 


He 


Asn 


Ser 


Met 






115 










120 




Ala 


Gly 


Leu 


Thr 


Ser 


Pro 


Asn 


Met 


Gly 




130 










135 






Gin 


Gly 


Pro 


Thr 


Gin 


Ser. 


Thr 


Gly 


Met 


145 










150 








Pro 


Ala 


Met 


Gly 


Met 


Asn 


Thr 


Gly 


Thr 










165 










Met 


Leu 


Ala 


Ala 


Gly 


Asn 


Gly 


Gin 


Gly 








180 










185 


Asn 


Gly 


Ser 


He 


Gly 


Ala 


Gly Arg 


Gly 






195 










200 




Asn 


Pro 


Gly 


Met 


Gly 


Ser 


Ala 


Gly 


Asn 




210 










215 






Gin 


Gly 


Ser 


Pro 


Gin 


Met 


Gly 


Gly 


Gin 


225 










230 








Pro 


Leu 


Lys 


Met 


Gly 


Met 


Met 


Asn 


Asn 










245 










Tyr 


Thr 


Gin 


Asn 


Pro 


Gly 


Gin 


Gin 


He 






260 










265 


Gin 


lie 


Gin 


Thr 


Lys 


Thr 


Val 


Leu 


Ser 






275 










280 




Met 


Asp 


Lys 


Lys 


Ala 


Val 


Pro 


Gly 


Gly 




290 










295 






Gin 


Pro 


Ala 


Pro 


Gin 


Val 


Gin 


Gin 


Pro 


305 










310 








Gin 


Gly 


Met 


Gly 


Ser 


Gly 


Ala 


His 


Thr 










325 










Leu 


He 


Gin 


Gin 


Gin 


Leu 


Val 


Leu 


Leu 








340 










345 


Arg 


Arg 


Glu 


Gin 


Ala 


Asn 


Gly 


Glu 


Val 






355 










360 




Cys 


Arg 


Thr 


Met 


Lys 


Asn 


Val 


Leu 


Asn 


370 










375 






Gly 


Lys 


Ser 


Cys 


Gin 


Val 


Ala 


His 


Cys 


385 










390 








Ser 


Hi^ 


Trp 


Lys 


Asn 


Cys 


Thr 


Arg 


His 










405 










Leu 


Lys 


Asn 


Ala 


Gly 


Asp 


Lys 


Arg 


Asn 








420 










425 


Ala 


Pro 


Val 


Gly 


Leu 


Gly 


Asn 


Pro 


Ser 






435 










440 




Ser 


Ala 


Pro 


Asn 


Leu 


Ser 


Thr 


Val 


Ser 




450 










455 






Glu 


Arg 


Ala 


Tyr 


Ala 


Ala 


Leu 


Gly 


Leu 


465 










470 








Pro 


Thr 


Gin 


Pro 


Gin 


Val 


Gin 


Ala 


Lys 










485 










Gly 


Gin 


Ser 


Pro 


Gin 


Gly 


Met 


Arg 


Pro 








500 










505 


Pro 


Met 


Gly 


Val 


Asn 


Gly 


Gly 


Val 


Gly 






515 










520 




Ser 


Asp 


Ser 


Met 


Leu 


His 


Ser 


Ala 


He 




530 










535 






Ser 


Glu 


Asn 


Ala 


Ser 


Val 


Pro 


Ser 


Leu 


545 










550 








Gin 


Pro 


Ser 


Thr 


Thr 


Gly 


He 


Arg 


Lys 










565 










Gin 


Asp 


Leu 


Arg 


Asn 


His 


Leu 


Val 


His 








580 










585 


Pro 


Thr 


Pro 


Asp 


Pro 


Ala 


Ala 


Leu 


Lys 






595 










600 
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77 



Val 


Lys 


Ser 


Pro 


Met 


Thr 


Gin 








125 








Met 


Gly 


Thr 


Ser 


Gly 


Pro 


Asn 






140 










Met 


Asn 


Ser 


Pro 


Val 


Asn 


Gin 




155 










160 


Asn 


Ala 


Gly 


Met 


Asn 


Pro 


Gly 


170 










175 




He 


Met 


Pro 


Asn 


Gin 


Val 


Met 










190 






Arg 


Gin 


Asp 


Met 


Gin 


Tyr 


Pro 








205 








Leu 


Leu 


Thr 


Glu 


Pro 


Leu 


Gin 






220 










Thr 


Gly 


Leu 


Arg 


Gly 


Pro 


Gin 




235 










240 


Pro 


Asn 


Pro 


Tyr 


Gly 


Ser 


Pro 


250 










255 




Gly 


Ala 


Ser 


Gly 


Leu 


Gly 


Leu 










270 






Asn 


Asn 


Leu 


Ser 


Pro 


Phe 


Ala 








285 








Gly 


Met 


Pro 


Asn 


Met 


Gly 


Gin 






300 










Gly 


Leu 


Val 


Thr 


Pro 


Val 


Ala 




315 










320 


Ala 


Asp 


Pro 


Glu 


Lys 


Arg 


Lys 


330 










335 




Leu 


His 


Ala 


His 


Lys 


Cys 


Gin 










350 






Arg 


Gin 


Cys 


Asn 


Leu 


Pro 


His 








365 








His 


Met 


Thr 


His 


Cys 


Gin 


Ser 






380 










Ala 


Ser 


Ser 


Arg 


Gin 


He 


He 




395 










400 


Asp 


Cys 


Pro 


Val 


Cys 


Leu 


Pro 


410 










415 




Gin 


Gin 


Pro 


He 


Leu 


Thr 


Gly 










430 






Ser 


Leu 


Gly 


Val' 


Gly 


Gin 


Gin 








445 








Gin 


He 


Asp 


Pro 


Ser 


Ser 


He 






460 










Pro 


Tyr 


Gin 


Val 


Asn 


Gin 


Met 




475 










480 


Asn 


Gin 


Gin 


Asn 


Gin 


Gin 


Pro 


490 










495 




Met 


Ser 


Asn 


Met 


Ser 


Ala 


Ser 










510 






Val 


Gin 


Thr 


Pro 


Ser 


Leu 


Leu 








525 








Asn 


Ser 


Gin 


Asn 


Pro 


Met 


Met 






540 










Gly 


Pro 


Met 


Pro 


Thr 


Ala 


Ala 




555 










560 


Gin 


Trp 


His 


Glu 


Asp 


He 


Thr 


570 










575 




Lys 


Leu 


Val 


Gin 


Ala 


He 


Phe 










590 






Asp 


Arg 


Arg 


Met 


Glu 


Asn 


Leu 



605 
Val Ala Tyr Ala Arg Lys Val Glu Gly Asp Met Tyr Glu Ser Ala Asn 

610 615 620 

Asn Arg Ala Glu Tyr Tyr His Leu Leu Ala Glu Lys Ile Tyr Lys Ile 
625 630 635 640 

Gln Lys Glu Leu Glu Glu Lys Arg Arg Thr Arg Leu Gln Lys Gln Asn 

645 650 655 

Met Leu Pro Asn Ala Ala Gly Met Val Pro Val Ser Met Asn Pro Gly 

660 665 670 

Pro Asn Met Gly Gln Pro Gln Pro Gly Met Thr Ser Asn Gly Pro Leu 

675 680 685 

Pro Asp Pro Ser Met Ile Arg Gly Ser Val Pro Asn Gln Met Met Pro 

690 695 700 

Arg Ile Thr Pro Gln Ser Gly Leu Asn Gln Phe Gly Gln Met Ser Met 
705 710 715 720 

Ala Gln Pro Pro Ile Val Pro Arg Gln Thr Pro Pro Leu Gln His His 

725 730 735 

Gly Gln Leu Ala Gln Pro Gly Ala Leu Asn Pro Pro Met Gly Tyr Gly 

740 745 750 

Pro Arg Met Gln Gln Pro Ser Asn Gln Gly Gln Phe Leu Pro Gln Thr 

755 760 765 

Gln Phe Pro Ser Gln Gly Met Asn Val Thr Asn Ile Pro Leu Ala Pro 

770 775 780 

Ser Ser Gly Gln Ala Pro Val Ser Gln Ala Gln Met Ser Ser Ser Ser 
785 790 795 800 

Cys Pro Val Asn Ser Pro Ile Met Pro Pro Gly Ser Gln Gly Ser His 

805 810 815 

Ile His Cys Pro Gln Leu Pro Gln Pro Ala Leu His Gln Asn Ser Pro 

820 825 830 

Ser Pro Val Pro Ser Arg Thr Pro Thr Pro His His Thr Pro Pro Ser 

835 840 845 

Ile Gly Ala Gln Gln Pro Pro Ala Thr Thr Ile Pro Ala Pro Val Pro 

850 855 860 

Thr Pro Pro Ala Met Pro Pro Gly Pro Gln Ser Gln Ala Leu His Pro 
865 870 875 880 

Pro Pro Arg Gln Thr Pro Thr Pro Pro Thr Thr Gln Leu Pro Gln Gln 

885 890 895 

Val Gln Pro Ser Leu Pro Ala Ala Pro Ser Ala Asp Gln Pro Gln Gln 

900 905 910 

Gln Pro Arg Ser Gln Gln Ser Thr Ala Ala Ser Val Pro Thr Pro Asn 

915 920 925 

Ala Pro Leu Leu Pro Pro Gln Pro Ala Thr Pro Leu Ser Gln Pro Ala 

930 935 940 

Val Ser Ile Glu Gly Gln Val Ser Asn Pro Pro Ser Thr Ser Ser Thr 
945 950 955 960 

Glu Val Asn Ser Gln Ala Ile Ala Glu Lys Gln Pro Ser Gln Glu Val 

965 970 975 

Lys Met Glu Ala Lys Met Glu Val Asp Gln Pro Glu Pro Ala Asp Thr 

980 985 990 

Gln Pro Glu Asp Ile Ser Glu Ser Lys Val Glu Asp Cys Lys Met Glu 

995 1000 1005 

Ser Thr Glu Thr Glu Glu Arg Ser Thr Glu Leu Lys Thr Glu Ile Lys 

1010 1015 1020 

Glu Glu Glu Asp Gln Pro Ser Thr Ser Ala Thr Gln Ser Ser Pro Ala 
1025 1030 1035 1040 

Pro Gly Gln Ser Lys Lys Lys Ile Phe Lys Pro Glu Glu Leu Arg Gln 

1045 1050 1055 

Ala Leu Met Pro Thr Leu Glu Ala Leu Tyr Arg Gln Asp Pro Glu Ser 

1060 1065 1070 

Leu Pro Phe Arg Gln Pro Val Asp Pro Gln Leu Leu Gly Ile Pro Asp 

1075 1080 1085 

Tyr Phe Asp Ile Val Lys Ser Pro Met Asp Leu Ser Thr Ile Lys Arg 
1090 1095 1100 
Lys Leu Asp Thr Gly Gln Tyr Gln Glu Pro Trp Gln Tyr Val Asp Asp 
1105 1110 1115 1120 

Ile Trp Leu Met Phe Asn Asn Ala Trp Leu Tyr Asn Arg Lys Thr Ser 

1125 1130 1135 

Arg Val Tyr Lys Tyr Cys Ser Lys Leu Ser Glu Val Phe Glu Gln Glu 

1140 1145 1150 

Ile Asp Pro Val Met Gln Ser Leu Gly Tyr Cys Cys Gly Arg Lys Leu 

1155 1160 1165 

Glu Phe Ser Pro Gln Thr Leu Cys Cys Tyr Gly Lys Gln Leu Cys Thr 

1170 1175 1180 

Ile Pro Arg Asp Ala Thr Tyr Tyr Ser Tyr Gln Asn Arg Tyr His Phe 
1185 1190 1195 1200 

Cys Glu Lys Cys Phe Asn Glu Ile Gln Gly Glu Ser Val Ser Leu Gly 

1205 1210 1215 

Asp Asp Pro Ser Gln Pro Gln Thr Thr Ile Asn Lys Glu Gln Phe Ser 

1220 1225 1230 

Lys Arg Lys Asn Asp Thr Leu Asp Pro Glu Leu Phe Val Glu Cys Thr 

1235 1240 1245 

Glu Cys Gly Arg Lys Met His Gln Ile Cys Val Leu His His Glu Ile 

1250 1255 1260 

Ile Trp Pro Ala Gly Phe Val Cys Asp Gly Cys Leu Lys Lys Ser Ala 
1265 1270 1275 1280 

Arg Thr Arg Lys Glu Asn Lys Phe Ser Ala Lys Arg Leu Pro Ser Thr 

1285 1290 1295 

Arg Leu Gly Thr Phe Leu Glu Asn Arg Val Asn Asp Phe Leu Arg Arg 

1300 1305 1310 

Gln Asn His Pro Glu Ser Gly Glu Val Thr Val Arg Val Val His Ala 

1315 1320 1325 

Ser Asp Lys Thr Val Glu Val Lys Pro Gly Met Lys Ala Arg Phe Val 

1330 1335 1340 

Asp Ser Gly Glu Met Ala Glu Ser Phe Pro Tyr Arg Thr Lys Ala Leu 
1345 1350 1355 1360 

Phe Ala Phe Glu Glu Ile Asp Gly Val Asp Leu Cys Phe Phe Gly Met 

1365 1370 1375 

His Val Gln Glu Tyr Gly Ser Asp Cys Pro Pro Pro Asn Gln Arg Arg 

1380 1385 1390 

Val Tyr Ile Ser Tyr Leu Asp Ser Val His Phe Phe Arg Pro Lys Cys 

1395 1400 1405 

Leu Arg Thr Ala Val Tyr His Glu Ile Leu Ile Gly Tyr Leu Glu Tyr 

1410 1415 1420 

Val Lys Lys Leu Gly Tyr Thr Thr Gly His Ile Trp Ala Cys Pro Pro 
1425 1430 1435 1440 

Ser Glu Gly Asp Asp Tyr Ile Phe His Cys His Pro Pro Asp Gln Lys 

1445 1450 1455 

Ile Pro Lys Pro Lys Arg Leu Gln Glu Trp Tyr Lys Lys Met Leu Asp 

1460 1465 1470 

Lys Ala Val Ser Glu Arg Ile Val His Asp Tyr Lys Asp Ile Phe Lys 

1475 1480 1485 

Gln Ala Thr Glu Asp Arg Leu Thr Ser Ala Lys Glu Leu Pro Tyr Phe 

1490 1495 1500 

Glu Gly Asp Phe Trp Pro Asn Val Leu Glu Glu Ser Ile Lys Glu Leu 
1505 1510 1515 1520 

Glu Gln Glu Glu Glu Glu Arg Lys Arg Glu Glu Asn Thr Ser Asn Glu 

1525 1530 1535 

Ser Thr Asp Val Thr Lys Gly Asp Ser Lys Asn Ala Lys Lys Lys Asn 

1540 1545 1550 

Asn Lys Lys Thr Ser Lys Asn Lys Ser Ser Leu Ser Arg Gly Asn Lys 

1555 1560 1565 

Lys Lys Pro Gly Met Pro Asn Val Ser Asn Asp Leu Ser Gln Lys Leu 

1570 1575 1580 

Tyr Ala Thr Met Glu Lys His Lys Glu Val Phe Phe Val Ile Arg Leu 
1585 1590 1595 1600 
Ile Ala Gly Pro Ala Ala Asn Ser Leu Pro Pro Ile Val Asp Pro Asp 

1605 1610 1615 

Pro Leu Ile Pro Cys Asp Leu Met Asp Gly Arg Asp Ala Phe Leu Thr 

1620 1625 1630 

Leu Ala Arg Asp Lys His Leu Glu Phe Ser Ser Leu Arg Arg Ala Gln 

1635 1640 1645 

Trp Ser Thr Met Cys Met Leu Val Glu Leu His Thr Gln Ser Gln Asp 

1650 1655 1660 

Arg Phe Val Tyr Thr Cys Asn Glu Cys Lys His His Val Glu Thr Arg 
1665 1670 1675 1680 

Trp His Cys Thr Val Cys Glu Asp Tyr Asp Leu Cys Ile Thr Cys Tyr 

1685 1690 1695 

Asn Thr Lys Asn His Asp His Lys Met Glu Lys Leu Gly Leu Gly Leu 

1700 1705 1710 

Asp Asp Glu Ser Asn Asn Gln Gln Ala Ala Ala Thr Gln Ser Pro Gly 

1715 1720 1725 

Asp Ser Arg Arg Leu Ser Ile Gln Arg Cys Ile Gln Ser Leu Val His 

1730 1735 1740 

Ala Cys Gln Cys Arg Asn Ala Asn Cys Ser Leu Pro Ser Cys Gln Lys 
1745 1750 1755 1760 

Met Lys Arg Val Val Gln His Thr Lys Gly Cys Lys Arg Lys Thr Asn 

1765 1770 1775 

Gly Gly Cys Pro Ile Cys Lys Gln Leu Ile Ala Leu Cys Cys Tyr His 

1780 1785 1790 

Ala Lys His Cys Gln Glu Asn Lys Cys Pro Val Pro Phe Cys Leu Asn 

1795 1800 1805 

Ile Lys Gln Lys Leu Arg Gln Gln Gln Leu Gln His Arg Leu Gln Gln 

1810 1815 1820 

Ala Gln Met Leu Arg Arg Arg Met Ala Ser Met Gln Arg Thr Gly Val 
1825 1830 1835 1840 

Val Gly Gln Gln Gln Gly Leu Pro Ser Pro Thr Pro Ala Thr Pro Thr 

1845 1850 1855 

Thr Pro Thr Gly Gln Gln Pro Thr Thr Pro Gln Thr Pro Gln Pro Thr 

1860 1865 1870 

Ser Gln Pro Gln Pro Thr Pro Pro Asn Ser Met Pro Pro Tyr Leu Pro 

1875 1880 1885 

Arg Thr Gln Ala Ala Gly Pro Val Ser Gln Gly Lys Ala Ala Gly Gln 

1890 1895 1900 

Val Thr Pro Pro Thr Pro Pro Gln Thr Ala Gln Pro Pro Leu Pro Gly 
1905 1910 1915 1920 

Pro Pro Pro Thr Ala Val Glu Met Ala Met Gln Ile Gln Arg Ala Ala 

1925 1930 1935 

Glu Thr Gln Arg Gln Met Ala His Val Gln Ile Phe Gln Arg Pro Ile 

1940 1945 1950 

Gln His Gln Met Pro Pro Met Thr Pro Met Ala Pro Met Gly Met Asn 

1955 1960 1965 

Pro Pro Pro Met Thr Arg Gly Pro Ser Gly His Leu Glu Pro Gly Met 

1970 1975 1980 

Gly Pro Thr Gly Met Gln Gln Gln Pro Pro Trp Ser Gln Gly Gly Leu 
1985 1990 1995 2000 

Pro Gln Pro Gln Gln Leu Gln Ser Gly Met Pro Arg Pro Ala Met Met 

2005 2010 2015 

Ser Val Ala Gln His Gly Gln Pro Leu Asn Met Ala Pro Gln Pro Gly 

2020 2025 2030 

Leu Gly Gln Val Gly Ile Ser Pro Leu Lys Pro Gly Thr Val Ser Gln 

2035 2040 2045 

Gln Ala Leu Gln Asn Leu Leu Arg Thr Leu Arg Ser Pro Ser Ser Pro 

2050 2055 2060 

Leu Gln Gln Gln Gln Val Leu Ser Ile Leu His Ala Asn Pro Gln Leu 
2065 2070 2075 2080 

Leu Ala Ala Phe Ile Lys Gln Arg Ala Ala Lys Tyr Ala Asn Ser Asn 
2085 2090 2095 
Pro Gln Pro Ile Pro Gly Gln Pro Gly Met Pro Gln Gly Gln Pro Gly 

2100 2105 2110 

Leu Gln Pro Pro Thr Met Pro Gly Gln Gln Gly Val His Ser Asn Pro 

2115 2120 2125 

Ala Met Gln Asn Met Asn Pro Met Gln Ala Gly Val Gln Arg Ala Gly 

2130 2135 2140 

Leu Pro Gln Gln Gln Pro Gln Gln Gln Leu Gln Pro Pro Met Gly Gly 
2145 2150 2155 2160 

Met Ser Pro Gln Ala Gln Gln Met Asn Met Asn His Asn Thr Met Pro 

2165 2170 2175 

Ser Gln Phe Arg Asp Ile Leu Arg Arg Gln Gln Met Met Gln Gln Gln 

2180 2185 2190 

Gln Gln Gln Gly Ala Gly Pro Gly Ile Gly Pro Gly Met Ala Asn His 

2195 2200 2205 

Asn Gln Phe Gln Gln Pro Gln Gly Val Gly Tyr Pro Pro Gln Pro Gln 

2210 2215 2220 

Gln Arg Met Gln His His Met Gln Gln Met Gln Gln Gly Asn Met Gly 
2225 2230 2235 2240 

Gln Ile Gly Gln Leu Pro Gln Ala Leu Gly Ala Glu Ala Gly Ala Ser 

2245 2250 2255 

Leu Gln Ala Tyr Gln Gln Arg Leu Leu Gln Gln Gln Met Gly Ser Pro 

2260 2265 2270 

Val Gln Pro Asn Pro Met Ser Pro Gln Gln His Met Leu Pro Asn Gln 

2275 2280 2285 

Ala Gln Ser Pro His Leu Gln Gly Gln Gln Ile Pro Asn Ser Leu Ser 

2290 2295 2300 

Asn Gln Val Arg Ser Pro Gln Pro Val Pro Ser Pro Arg Pro Gln Ser 
2305 2310 2315 2320 

Gln Pro Pro His Ser Ser Pro Ser Pro Arg Met Gln Pro Gln Pro Ser 

2325 2330 2335 

Pro His His Val Ser Pro Gln Thr Ser Ser Pro His Pro Gly Leu Val 

2340 2345 2350 

Ala Ala Gln Ala Asn Pro Met Glu Gln Gly His Phe Ala Ser Pro Asp 

2355 2360 2365 

Gln Asn Ser Met Leu Ser Gln Leu Ala Ser Asn Pro Gly Met Ala Asn 

2370 2375 2380 

Leu His Gly Ala Ser Ala Thr Asp Leu Gly Leu Ser Thr Asp Asn Ser 
2385 2390 2395 2400 

Asp Leu Asn Ser Asn Leu Ser Gln Ser Thr Leu Asp Ile His 
2405 2410 

(2) INFORMATION FOR SEQ ID NO:7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2441 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: None 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 

Met Ala Glu Asn Leu Leu Asp Gly Pro Pro Asn Pro Lys Arg Ala Lys 

1 5 10 15 

Leu Ser Ser Pro Gly Phe Ser Ala Asn Asp Asn Thr Asp Phe Gly Ser 

20 25 30 

Leu Phe Asp Leu Glu Asn Asp Leu Pro Asp Glu Leu Ile Pro Asn Gly 

35 40 45 

Glu Leu Ser Leu Leu Asn Ser Gly Asn Leu Val Pro Asp Ala Ala Ser 

50 55 60 

Lys His Lys Gln Leu Ser Glu Leu Leu Arg Gly Gly Ser Gly Ser Ser 

65 70 75 80 






Ile Asn Pro Gly Ile Gly Asn Val Ser Ala Ser Ser Pro Val Gln Gln 
85 90 95 



(jj, y 


Leii 


<j±. y 


yjjL. y 


Gi n 


Aia 


r;i n 

ox 11 


Gi y 


Gin 


Pro 


Asn 


Ser 


Thr 


Asn 


Met 


Ala 






100 










105 










110 






C o T- 

o e i 


Le Li 


ox y 




Me t 


Gi y 




Ser 


Pro 


Leu 


As n 


Gin 


Gi V 


Asp 


Ser 


Ser 






115 










120 










125 








i nr 


Pro 


As n 


- 

Leu 


JT X 


Lys 


ril n 

ox 1 1 


al a 

r\X d 


al ?\ 

rtx d 


Ser 


Thr 


Ser 


Giy 


Pro 


Thr 


Pro 




X J u 










X <3 .J 










140 










Pro 


a 


Ser 


OXi I 


rtX d 


Leu 


A c n 


IT X 


fXl n 
ox 1 1 


Aia 


Gin 


Lys 


Gin 


Vai 


Giy 


Leu 












X ^ VJ 










155 










160 


Vai 


Thr 


o e r 


C7 Q »- 

o e r 


Pro 


MX a 


i n X 


oe X 


1 n 

oX 11. 


Thr 

1 1 1 X 


ox y 


Pro 


Gi y 


lie 


C y s 


Me t 










X U •J 










17 0 










175 




Asn 


A± a 


As n 


rrne 


As n 


oxn 


i n X 


MX 5 


Pro 


ox y 


Leu 


Leu 


As n 


Ser 


Asn 


O X 








180 










1 ft s 

X O J 










190 






Gl y 


HI S 


Ser 


Leu 


1 le I- 


As n 


ox 11 


al A 

MX d 


r;l n 

ox 1 1 


ox i 1 


Gi y 


Gin 


Aia 


Gin 


Val 


Met 




195 










9 0 0 










205 








As n 


(j-x y 


Ser 


Leu 


ox _y 


Al ;^ 


Al ;^ 
/~\X d 


Gi y 


^^r g 


Gi y 


Arg 


Gly 


Aia 


Giy 


Met 


Pro 




ii. X u 










<C X ^ 










22 0 










Tyr 


Pro 


M.JL a 


ir X u 


/-vx d 




ox 1 1 


r;i v 

ox y 


al ?\ 


Th r 


Ser 


Ser 


Vai 


Leu 


Aia 


Giu 


O O 










9 n 

^ o vj 










2 35 










2 4 0 


Thr 


Leu 


i n IT 


vjX 11 


V d X 


Ser 


Pro 


ni r\ 
OX 1 i 


1*1 C L- 


Al 3 

/AX d 


ox y 


Hie; 
nx > 


Aia 


Giy 


Leu 


As n 










? 4 S 










2 50 










2 55 




Thr 


AX a 


oxn 


M.X a 


ox y 


ox y 




1 IIX 


Lys 




Giy 


Met 


Thr 


Giy 


Thr 


Thr 








^ DU 










9 S 










27 0 






Ser 


Pro 


O K 

ir ne 


oxy 


oxn 


Pro 


irne 


Ser 


ril n 

ox 11 


Th r 

1 1 1 X 


r^i \/ 

ox y 


Gi y 


Gin 


Gin 


Met 


Gl y 






9 7 "^^ 








2 8 0 










285 








TV 1 ^ 

Al a 


i nr 


(jx y 


V a X 


Asn 


Pro 


ox 11 


Leu 


nl ;^ 


Ser 


Lys 


Gin 


Ser 


Met 


Vai 


As n 




<i y u 








9 Q S 










3 00 










Ser 


Leu 


Pro 


AX a 


rr ne 


Pro 


i. 1 1 X 


As p 


T 1 pi 

X X c 


ijy s 


As n 


Thr 


Ser 


Vai 


Thr 


Thr 


305 










X Lf 










315 










320 


V a X 


Pro 


Asn 


r^e TL 


o e X 


ox 11 


Leu 


ril n 
ox 1 1 


Th y 

1 11 X 


o X 


Vai 


Gi y 


lie 


Vai 


Pro 


Thr 










32 5 










330 










335 




Gin 


Axa 


X X e 


AX a 


i n X 


ox y 


Pro 


Th r- 
X IIX 


al 3 

>-vX d 


A c M 
r-via p 


Pro 


Giu 


Lys 


Arg 


Lys 


Leu 








4 n 

O T u 










34 5 










350 






lie 


Gin 


Cjxn 


oxn 


Leu 


A/3 1 
V a X 


Leu 


Leu 


Xitr >-i 


Hie: 
nx o 


A 1 


H i c; 
nx o 


Lys 


Cy s 


Gin 


Arg 






355 










O O V 










3 65 








Arg 


CjXU 


^x n 


M.X a 


Asn 


ox y 


r;i n 

ox Li 


1 

V d X 


A rrr 
i^x y 


Aia 


Cy s 


Ser 


Leu 


Pro 


Hi s 


Cys 


o T r\ 










7 S 










3 8 0 










Arg 


Thr 


ne u 


Lys 


As n 


V d X 


Leu 


> vo 1 1 


Hie: 
nx o 


1 It? L. 


Th r 

X 11 X 


His 


Cy s 


Gi n 


Aia 


Pro 


T Q Cl 
J O J 










O -7 u 










395 










4 0 0 


Lys 


AX a 


Cy s 


tjxn 


V ax 


A 1 a 

MX a 


nx o 


Cy s 


A 1 ^ 

MX d 


OCX 


OCX 


r\X y 


Gin 


T 1 

X X c 


T 1 *a 
X X c 


OCX 








fi u o 










4 1 0 

1 X w 










415 




ni s 


Trp 


Lys 


As n 


Cy s 


Thr 


Ar g 


Hi s 


Asp 


Cy s 


Pro 


Val 


Cy s 


Leu 


Pro 


Leu 








d 9 n 
*4 ^ \j 










425 










430 






Lys 


As n 


i-vx a 


Q o r* 
OCX 


Asp 


Lys 


Arg 


As n 


Gin 


Gin 


Thr 


lie 


Leu 


Giy 


Ser 


Pro 




4 s 

*i o ^ 










4 40 










445 








>\-L a 


Ser 


tjyx y 


X X e 


OX 1 1 




Th r 

X 1 1 X 


T 1 R 

X X c 


Giy 


Ser 


Vai 


Giy 


Ala 


Giy 


Gin 


Gin 




/I c; n 
H 5 U 








4 S S 
n -J ^ 










460 










Asn 


AX a 


i n j_ 


Ser 


Leu 


^ r* 

OCX 


As n 


Pro 


Asn 


Pro 


lie 


Asp 


Pro 


Ser 


Ser 


Met 


465 










470 










475 










480 


Gin 


Arg 


Aia 


Tyr 


Aia 


Ala 


Leu 


Gly 


Leu 


Pro 


Tyr 


Met 


Asn 


Gin 


Pro 


Gin 










485 










490 










495 




Thr 


Gin 


Leu 


Gin 


Pro 


Gin 


Vai 


Pro 


Giy 


Gin 


Gin 


Pro 


Aia 


Gin 


Pro 


Pro 








500 










505 










510 






Aia 


His 


Gin 


Gin 


Met 


Arg 


Thr 


Leu 


Asn 


Aia 


Leu 


Giy 


Ash 


Asn 


Pro 


Met 






515 










520 










525 








Ser 


Vai 


Pro 


Aia 


Gly 


Gly 


lie 


Thr 


Thr 


Asp 


Gin 


Gin 


Pro 


Pro 


Asn 


Leu 




530 










535 










540 










lie 


Ser 


Giu 


Ser 


Aia 


Leu 


Pro 


Thr 


Ser 


Leu 


Giy 


Aia 


Thr 


Asn 


Pro 


Leu 


545 










550 










555 










560 


Met 


Asn 


Asp 


Gly 


Ser 


Asn 


Ser 


Giy 


Asn 


lie 


Giy 


Ser 


Leu 


Ser 


Thr 


lie 



565 570 575 
Pro Thr Ala Ala Pro Pro Ser Ser Thr Gly Val Arg Lys Gly Trp His 

580 585 590 

Glu His Val Thr Gln Asp Leu Arg Ser His Leu Val His Lys Leu Val 

595 600 605 

Gln Ala Ile Phe Pro Thr Pro Asp Pro Ala Ala Leu Lys Asp Arg Arg 

610 615 620 

Met Glu Asn Leu Val Ala Tyr Ala Lys Lys Val Glu Gly Asp Met Tyr 

625 630 635 640 

Glu Ser Ala Asn Ser Arg Asp Glu Tyr Tyr His Leu Leu Ala Glu Lys 

645 650 655 

Ile Tyr Lys Ile Gln Lys Glu Leu Glu Glu Lys Arg Arg Thr Arg Leu 

660 665 670 

His Lys Gln Gly Ile Leu Gly Asn Gln Pro Ala Leu Pro Ala Ser Gly 

675 680 685 

Ala Gln Pro Pro Val Ile Pro Pro Ala Gln Ser Val Arg Pro Pro Asn 

690 695 700 

Gly Pro Leu Pro Leu Pro Val Asn Arg Met Gln Val Ser Gln Gly Met 

705 710 715 720 

Asn Ser Phe Asn Pro Met Ser Leu Gly Asn Val Gln Leu Pro Gln Ala 

725 730 735 

Pro Met Gly Pro Arg Ala Ala Ser Pro Met Asn His Ser Val Gln Met 

740 745 750 

Asn Ser Met Ala Ser Val Pro Gly Met Ala Ile Ser Pro Ser Arg Met 

755 760 765 

Pro Gln Pro Pro Asn Met Met Gly Thr His Ala Asn Asn Ile Met Ala 

770 775 780 

Gln Ala Pro Thr Gln Asn Gln Phe Leu Pro Gln Asn Gln Phe Pro Ser 

785 790 795 800 

Ser Ser Gly Ala Met Ser Val Asn Ser Val Gly Met Gly Gln Pro Ala 

805 810 815 

Ala Gln Ala Gly Val Ser Gln Gly Gln Glu Pro Gly Ala Ala Leu Pro 

820 825 830 

Asn Pro Leu Asn Met Leu Ala Pro Gln Ala Ser Gln Leu Pro Cys Pro 

835 840 845 

Pro Val Thr Gln Ser Pro Leu His Pro Thr Pro Pro Pro Ala Ser Thr 

850 855 860 

Ala Ala Gly Met Pro Ser Leu Gln His Pro Thr Ala Pro Gly Met Thr 

865 870 875 880 

Pro Pro Gln Pro Ala Ala Pro Thr Gln Pro Ser Thr Pro Val Ser Ser 

885 890 895 

Gly Gln Thr Pro Thr Pro Thr Pro Gly Ser Val Pro Ser Ala Ala Gln 

900 905 910 

Thr Gln Ser Thr Pro Thr Val Gln Ala Ala Ala Gln Ala Gln Val Thr 

915 920 925 

Pro Gln Pro Gln Thr Pro Val Gln Pro Pro Ser Val Ala Thr Pro Gln 

930 935 940 

Ser Ser Gln Gln Gln Pro Thr Pro Val His Thr Gln Pro Pro Gly Thr 

945 950 955 960 

Pro Leu Ser Gln Ala Ala Ala Ser Ile Asp Asn Arg Val Pro Thr Pro 

965 970 975 

Ser Thr Val Thr Ser Ala Glu Thr Ser Ser Gln Gln Pro Gly Pro Asp 

980 985 990 

Val Pro Met Leu Glu Met Lys Thr Glu Val Gln Thr Asp Asp Ala Glu 

995 1000 1005 

Pro Glu Pro Thr Glu Ser Lys Gly Glu Pro Arg Ser Glu Met Met Glu 

1010 1015 1020 

Glu Asp Leu Gln Gly Ser Ser Gln Val Lys Glu Glu Thr Asp Thr Thr 

1025 1030 1035 1040 

Glu Gln Lys Ser Glu Pro Met Glu Val Glu Glu Lys Lys Pro Glu Val 

1045 1050 1055 

Lys Val Glu Ala Lys Glu Glu Glu Glu Asn Ser Ser Asn Asp Thr Ala 

1060 1065 1070 
Ser Gln Ser Thr Ser Pro Ser Gln Pro Arg Lys Lys Ile Phe Lys Pro 

1075 1080 1085 

Glu Glu Leu Arg Gln Ala Leu Met Pro Thr Leu Glu Ala Leu Tyr Arg 

1090 1095 1100 

Gln Asp Pro Glu Ser Leu Pro Phe Arg Gln Pro Val Asp Pro Gln Leu 
1105 1110 1115 1120 

Leu Gly Ile Pro Asp Tyr Phe Asp Ile Val Lys Asn Pro Met Asp Leu 

1125 1130 1135 

Ser Thr Ile Lys Arg Lys Leu Asp Thr Gly Gln Tyr Gln Glu Pro Trp 

1140 1145 1150 

Gln Tyr Val Asp Asp Val Arg Leu Met Phe Asn Asn Ala Trp Leu Tyr 

1155 1160 1165 

Asn Arg Lys Thr Ser Arg Val Tyr Lys Phe Cys Ser Lys Leu Ala Glu 

1170 1175 1180 

Val Phe Glu Gln Glu Ile Asp Pro Val Met Gln Ser Leu Gly Tyr Cys 
1185 1190 1195 1200 

Cys Gly Arg Lys Tyr Glu Phe Ser Pro Gln Thr Leu Cys Cys Tyr Gly 

1205 1210 1215 

Lys Gln Leu Cys Thr Ile Pro Arg Asp Ala Ala Tyr Tyr Ser Tyr Gln 

1220 1225 1230 

Asn Arg Tyr His Phe Cys Gly Lys Cys Phe Thr Glu Ile Gln Gly Glu 

1235 1240 1245 

Asn Val Thr Leu Gly Asp Asp Pro Ser Gln Pro Gln Thr Thr Ile Ser 

1250 1255 1260 

Lys Asp Gln Phe Glu Lys Lys Lys Asn Asp Thr Leu Asp Pro Glu Pro 
1265 1270 1275 1280 

Phe Val Asp Cys Lys Glu Cys Gly Arg Lys Met His Gln Ile Cys Val 

1285 1290 1295 

Leu His Tyr Asp Ile Ile Trp Pro Ser Gly Phe Val Cys Asp Asn Cys 

1300 1305 1310 

Leu Lys Lys Thr Gly Arg Pro Arg Lys Glu Asn Lys Phe Ser Ala Lys 

1315 1320 1325 

Arg Leu Gln Thr Thr Arg Leu Gly Asn His Leu Glu Asp Arg Val Asn 

1330 1335 1340 

Lys Phe Leu Arg Arg Gln Asn His Pro Glu Ala Gly Glu Val Phe Val 
1345 1350 1355 1360 

Arg Val Val Ala Ser Ser Asp Lys Thr Val Glu Val Lys Pro Gly Met 

1365 1370 1375 

Lys Ser Arg Phe Val Asp Ser Gly Glu Met Ser Glu Ser Phe Pro Tyr 

1380 1385 1390 

Arg Thr Lys Ala Leu Phe Ala Phe Glu Glu Ile Asp Gly Val Asp Val 

1395 1400 1405 

Cys Phe Phe Gly Met His Val Gln Asp Thr Ala Leu Ile Ala Pro His 

1410 1415 1420 

Gln Ile Gln Gly Cys Val Tyr Ile Ser Tyr Leu Asp Ser Ile His Phe 
1425 1430 1435 1440 

Phe Arg Pro Arg Cys Leu Arg Thr Ala Val Tyr His Glu Ile Leu Ile 

1445 1450 1455 

Gly Tyr Leu Glu Tyr Val Lys Lys Leu Val Tyr Val Thr Ala His Ile 

1460 1465 1470 

Trp Ala Cys Pro Pro Ser Glu Gly Asp Asp Tyr Ile Phe His Cys His 

1475 1480 1485 

Pro Pro Asp Gln Lys Ile Pro Lys Pro Lys Arg Leu Gln Glu Trp Tyr 

1490 1495 1500 

Lys Lys Met Leu Asp Lys Ala Phe Ala Glu Arg Ile Ile Asn Asp Tyr 
1505 1510 1515 1520 

Lys Asp Ile Phe Lys Gln Ala Asn Glu Asp Arg Leu Thr Ser Ala Lys 

1525 1530 1535 

Glu Leu Pro Tyr Phe Glu Gly Asp Phe Trp Pro Asn Val Leu Glu Glu 

1540 1545 1550 

Ser Ile Lys Glu Leu Glu Gln Glu Glu Glu Glu Arg Lys Lys Glu Glu 
1555 1560 1565 
Ser Thr Ala Ala Ser Glu Thr Pro Glu Gly Ser Gln Gly Asp Ser Lys 

1570 1575 1580 

Asn Ala Lys Lys Lys Asn Asn Lys Lys Thr Asn Lys Asn Lys Ser Ser 
1585 1590 1595 1600 

Ile Ser Arg Ala Asn Lys Lys Lys Pro Ser Met Pro Asn Val Ser Asn 

1605 1610 1615 

Asp Leu Ser Gln Lys Leu Tyr Ala Thr Met Glu Lys His Lys Glu Val 

1620 1625 1630 

Phe Phe Val Ile His Leu His Ala Gly Pro Val Ile Ser Thr Gln Pro 

1635 1640 1645 

Pro Ile Val Asp Pro Asp Pro Leu Leu Ser Cys Asp Leu Met Asp Gly 

1650 1655 1660 

Arg Asp Ala Phe Leu Thr Leu Ala Arg Asp Lys His Trp Glu Phe Ser 
1665 1670 1675 1680 

Ser Leu Arg Arg Ser Lys Trp Ser Thr Leu Cys Met Leu Val Glu Leu 

1685 1690 1695 

His Thr Gln Gly Gln Asp Arg Phe Val Tyr Thr Cys Asn Glu Cys Lys 

1700 1705 1710 

His His Val Glu Thr Arg Trp His Cys Thr Val Cys Glu Asp Tyr Asp 

1715 1720 1725 

Leu Cys Ile Asn Cys Tyr Asn Thr Lys Ser His Thr His Lys Met Val 

1730 1735 1740 

Lys Trp Gly Leu Gly Leu Asp Asp Glu Gly Ser Ser Gln Gly Glu Pro 
1745 1750 1755 1760 

Gln Ser Lys Ser Pro Gln Glu Ser Arg Arg Leu Ser Ile Gln Arg Cys 

1765 1770 1775 

Ile Gln Ser Leu Val His Ala Cys Gln Cys Arg Asn Ala Asn Cys Ser 

1780 1785 1790 

Leu Pro Ser Cys Gln Lys Met Lys Arg Val Val Gln His Thr Lys Gly 

1795 1800 1805 

Cys Lys Arg Lys Thr Asn Gly Gly Cys Pro Val Cys Lys Gln Leu Ile 

1810 1815 1820 

Ala Leu Cys Cys Tyr His Ala Lys His Cys Gln Glu Asn Lys Cys Pro 
1825 1830 1835 1840 

Val Pro Phe Cys Leu Asn Ile Lys His Asn Val Arg Gln Gln Gln Ile 

1845 1850 1855 

Gln His Cys Leu Gln Gln Ala Gln Leu Met Arg Arg Arg Met Ala Thr 

1860 1865 1870 

Met Asn Thr Arg Asn Val Pro Gln Gln Ser Leu Pro Ser Pro Thr Ser 

1875 1880 1885 

Ala Pro Pro Gly Thr Pro Thr Gln Gln Pro Ser Thr Pro Gln Thr Pro 

1890 1895 1900 

Gln Pro Pro Ala Gln Pro Gln Pro Ser Pro Val Asn Met Ser Pro Ala 
1905 1910 1915 1920 

Gly Phe Pro Asn Val Ala Arg Thr Gln Pro Pro Thr Ile Val Ser Ala 

1925 1930 1935 

Gly Lys Pro Thr Asn Gln Val Pro Ala Pro Pro Pro Pro Ala Gln Pro 

1940 1945 1950 

Pro Pro Ala Ala Val Glu Ala Ala Arg Gln Ile Glu Arg Glu Ala Gln 

1955 1960 1965 

Gln Gln Gln His Leu Tyr Arg Ala Asn Ile Asn Asn Gly Met Pro Pro 

1970 1975 1980 

Gly Arg Asp Gly Met Gly Thr Pro Gly Ser Gln Met Thr Pro Val Gly 
1985 1990 1995 2000 

Leu Asn Val Pro Arg Pro Asn Gln Val Ser Gly Pro Val Met Ser Ser 

2005 2010 2015 

Met Pro Pro Gly Gln Trp Gln Gln Ala Pro Ile Pro Gln Gln Gln Pro 

2020 2025 2030 

Met Pro Gly Met Pro Arg Pro Val Met Ser Met Gln Ala Gln Ala Ala 

2035 2040 2045 

Val Ala Gly Pro Arg Met Pro Asn Val Gln Pro Asn Arg Ser Ile Ser 
2050 2055 2060 
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Pro Ser Ala Leu Gin Asp Leu Leu Arg Thr Leu Lys Ser Pro Ser Ser 
065 2070 2075 2080 

Pro Gin Gin Gin Gin Gin Val Leu Asn lie Leu Lys Ser Asn Pro Gin 

2085 2090 2095 

Leu Met Ala Ala Phe He Lys Gin Arg Thr Ala Lys Tyr Val Ala Asn 

2100 2105 2110 

Gin Pro Gly Met Gin Pro Gin Pro Gly Leu Gin Ser Gin' Pro Gly Met 

2115 2120 2125 

Gin Pro Gin Pro Gly Met His Gin Gin Pro Ser Leu Gin Asn Leu Asn 

2130 2135 2140 

Ala Met Gin Ala Gly Val Pro Arg Pro Gly Val Pro Pro Pro Gin Pro 
145 2150 2155 2160 

Ala Met Gly Gly Leu Asn Pro Gin Gly Gin Ala Leu Asn He Met Asn 

2165 2170 2175 

Pro Gly His Asn Pro Asn Met Thr Asn Met Asn Pro Gin Tyr Arg Glu 

2180 2185 2190 

Met Val Arg Arg Gin Leu Leu Gin His Gin Gin Gin Gin Gin Gin Gin 

2195 2200 2205 

Gin Gin Gin Gin Gin Gin Gin Gin Asn Ser Ala Ser Leu Ala Gly Gly 

2210 .2215 2220 

Met Ala Gly His Ser Gin Phe Gin Gin Pro Gin Gly Pro Gly Gly Tyr 
225 2230 2235 2240 

Ala Pro Ala Met Gin Gin Gin Arg Met Gin Gin His Leu Pro He Gin 

2245 2250 2255 

Gly Ser Ser Met Gly Gin Met Ala Ala Pro Met Gly Gin Leu Gly Gin 

2260 2265 2270 

Met Gly Gin Pro Gly Leu Gly Ala Asp Ser Thr Pro Asn He Gin Gin 

2275 2280 2285 

Ala Leu Gin Gin Arg He Leu Gin Gin Gin Gin Met Lys Gin Gin I Le 

2290 2295 2300 

Gly Ser Pro Gly Gin Pro Asn Pro Met Ser Pro Gin Gin His Met Leu 
305 2310 2315 2320 

Ser Gly Gin Pro Gin Ala Ser His Leu Pro Gly Gin Gin He Ala Thr 

2325 2330 2335 

Ser Leu Ser Asn Gin Val Arg Ser Pro Ala Pro Val Gin Ser Pro Arg 

2340 2345 2350 

Pro Gin Ser Gin Pro Pro His Ser Ser Pro Ser Pro Arg He Gin Pro 

2355 2360 2365 

Gin Pro Ser Pro His His Val Ser Pro Gin Thr Gly Thr Pro His Pro 

2370 2375 2380 

Gly Leu Ala Val Thr Met Ala Ser Ser Met Asp Gin Gly His Leu Gly 
385 2390 2395 2400 

Asn Pro Glu Gin Ser Ala Met Leu Pro Gin Leu Asn Thr Pro Asn Arg 

2405 2410 2415 

Ser Ala Leu Ser Ser Glu Leu Ser Leu Val Gly Asp Thr Thr Gly Asp 

2420 2425 2430 

Thr Leu Glu Lys Phe Val Glu Gly Leu 
2435 2440 

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 813 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: None

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:



Met Ala Glu Ala Gly Gly Ala Gly Ser Pro Ala Leu Pro Pro Ala Pro

1 5 10 15

Pro His Gly Ser Pro Arg Thr Leu Ala Thr Ala Ala Gly Ser Ser Ala

20 25 30

Ser Cys Gly Pro Ala Thr Pro Val Ala Ala Ala Gly Thr Ala Glu Gly

35 40 45

Pro Gly Gly Gly Gly Ser Ala Arg Ile Ala Val Lys Lys Ala Gln Leu

50 55 60

Arg Ser Ala Pro Arg Ala Lys Lys Leu Glu Lys Leu Gly Val Tyr Ser

65 70 75 80

Ala Cys Lys Ala Glu Glu Ser Cys Lys Cys Asn Gly Trp Lys Asn Pro

85 90 95

Asn Pro Ser Pro Thr Pro Pro Arg Gly Asp Leu Gln Gln Ile Ile Val

100 105 110

Ser Leu Thr Glu Ser Cys Arg Ser Cys Ser His Ala Leu Ala Ala His

115 120 125

Val Ser His Leu Glu Asn Val Ser Glu Glu Glu Met Asp Arg Leu Leu

130 135 140

Gly Ile Val Leu Asp Val Glu Tyr Leu Phe Thr Cys Val His Lys Glu

145 150 155 160

Glu Asp Ala Asp Thr Lys Gln Val Tyr Phe Tyr Leu Phe Lys Leu Leu

165 170 175

Arg Lys Ser Ile Leu Gln Arg Gly Lys Pro Val Val Glu Gly Ser Leu

180 185 190

Glu Lys Lys Pro Pro Phe Glu Lys Pro Ser Ile Glu Gln Gly Val Asn

195 200 205

Asn Phe Val Gln Tyr Lys Phe Ser His Leu Pro Ser Lys Glu Arg Gln

210 215 220

Thr Thr Ile Glu Leu Ala Lys Met Phe Leu Asn Arg Ile Asn Tyr Trp

225 230 235 240

His Leu Glu Ala Pro Ser Gln Arg Arg Leu Arg Ser Pro Asn Asp Asp

245 250 255

Ile Ser Gly Tyr Lys Glu Asn Tyr Thr Arg Trp Leu Cys Tyr Cys Asn

260 265 270

Val Pro Gln Phe Cys Asp Ser Leu Pro Arg Tyr Glu Thr Thr Lys Val

275 280 285

Phe Gly Arg Thr Leu Leu Arg Ser Val Phe Thr Ile Met Arg Arg Gln

290 295 300

Leu Leu Glu Gln Ala Arg Gln Lys Lys Asp Lys Leu Pro Leu Glu Lys

305 310 315 320

Arg Thr Leu Ile Leu Thr His Phe Pro Lys Phe Leu Ser Met Leu Glu

325 330 335

Glu Glu Val Tyr Ser Gln Asn Ser Pro Ile Trp Asp Gln Asp Phe Leu

340 345 350

Ser Ala Ser Ser Arg Thr Ser Pro Leu Gly Ile Gln Thr Val Ile Ser

355 360 365

Pro Pro Val Thr Gly Thr Ala Leu Phe Ser Ser Asn Ser Thr Ser His

370 375 380

Glu Gln Ile Asn Gly Gly Arg Thr Ser Pro Gly Cys Arg Gly Ser Ser

385 390 395 400

Gly Leu Glu Ala Asn Pro Gly Glu Lys Arg Lys Met Asn Asn Ser His

405 410 415

Ala Pro Glu Glu Ala Lys Arg Ser Arg Val Met Gly Asp Ile Pro Val

420 425 430

Glu Leu Ile Asn Glu Val Met Ser Thr Ile Thr Asp Pro Ala Gly Met

435 440 445

Leu Gly Pro Glu Thr Asn Phe Leu Ser Ala His Ser Ala Arg Asp Glu

450 455 460

Ala Ala Arg Leu Glu Glu Arg Arg Gly Val Ile Glu Phe His Val Val

465 470 475 480

Gly Asn Ser Leu Asn Gln Lys Pro Asn Lys Lys Ile Leu Met Trp Leu

485 490 495

Val Gly Leu Gln Asn Val Phe Ser His Gln Leu Pro Arg Met Pro Lys

500 505 510

Glu Tyr Ile Thr Arg Leu Val Phe Asp Pro Lys His Lys Thr Leu Ala

515 520 525

Ile Lys Asp Gly Arg Val Ile Gly Gly Ile Cys Phe Arg Met Phe

530 535 540

Pro Ser Gln Gly Phe Thr Glu Ile Val Phe Cys Ala Val Thr Ser Asn

545 550 555 560

Glu Gln Val Lys Gly Tyr Gly Thr His Leu Met Asn His Leu Lys Glu

565 570 575

i ys- His Ile Lys His Glu Ile Leu Asn Phe Leu Thr Tyr Ala Asp Glu

580 585 590

i y I. Ala Ile Gly Tyr Phe Lys Lys Gln Gly Phe Ser Lys Glu Ile Lys

595 600 605

Ile Pro Lys Thr Lys Tyr Val Gly Tyr Ile Lys Asp Tyr Glu Gly Ala

610 615 620

Thr Leu Met Gly Cys Glu Leu Asn Pro Gln Ile Pro Tyr Thr Glu Phe

625 630 635 640

Ser Val Ile Ile Lys Lys Gln Lys Glu Ile Ile Lys Lys Leu Ile Glu

645 650 655

Lys Gln Ala Gln Ile Arg Lys Val Tyr Pro Gly Leu Ser Cys Phe

660 665 670

Asp Gly Val Arg Gln Ile Pro Ile Glu Ser Ile Pro Gly Ile Arg

675 680 685

Glu Thr Gly Trp Lys Pro Ser Gly Lys Glu Lys Ser Lys Glu Pro Lys

690 695 700

Asp Pro Glu His Val Tyr Ser Thr Leu Lys Asn Ile Leu Gln Gln Val

705 710 715 720

Lys Asn His Pro Asn Ala Trp Pro Phe Met Glu Pro Val Lys Arg Thr

725 730 735

Ala Pro Gly Tyr Tyr Glu Val Ile Arg Phe Pro Met Asp Leu Lys

740 745 750

Thr Met Ser Glu Arg Leu Arg Asn Arg Tyr Tyr Val Ser Lys Lys Leu

755 760 765

Phe Met Ala Asp Leu Gln Arg Val Phe Thr Asn Cys Lys Glu Tyr Asn

770 775 780

Pro Pro Glu Ser Glu Tyr Tyr Lys Cys Ala Ser Ile Leu Glu Lys Phe

785 790 795 800

Phe Phe Ser Lys Ile Lys Glu Ala Gly Leu Ile Asp Lys

805 810

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 50 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

His Thr Lys Gly Cys Lys Arg Lys Thr Asn Gly Gly Cys Pro Val Cys

1 5 10 15

Lys Gln Leu Ile Ala Leu Cys Cys Tyr His Ala Lys His Cys Gln Glu

20 25 30

Asn Lys Cys Pro Val Pro Phe Cys Leu Asn Ile Lys His Asn Val Arg

35 40 45

Gln Gln

50

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2204 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:



ACCCACTCCC CCCAGAGCCG ACCTGCAGCA AATAATTGTC AGTCTAACAG AATCCTGTCG 60

GAGTTGTAGC CATGCCCTAG CTGCTCATGT TTCCCACCTG GAGAATGTGT CAGAGGAAGA 120

AATGAACAGA CTCCTGGGAA TAGTATTGGA TGTGGAATAT CTCTTTACCT GTGTCCACAA 180

GGAAGAAGAT GCAGATACCA AACAAGTTTA TTTCTATCTA TTTAAGCTCT TGAGAAAGTC 240

TATTTTACAA AGAGGAAAAC CTGTGGTTGG AAGGCTCTTT GGAAAAGAAA CCCCCATTTG 300

AAAAACCTAG CATTGAACAG GGTGTGAATA ACTTTGTGCA GTACAAATTT AGTCACCTGC 360

CAGCAAAAAG AAAGGCAAAC CAATAGTTGA GTTGGCAAAA ATGTTCCTAA ACCGCATCAC 420

CTATTGGCAT CTGGAGGCAC CATCTCAACG AGACTGCGAT CTCCAATGAT GATATTCTGG 480

ATACAAAGAG AACTACACAA GGTGGCTGTG TTACTGCAAC GTGCCACAGT TCTGCGACAG 540

TCTACCTCGG TACGAAACCA CACAGGTGTT TGGGAGAACA TCGTTCGCTC GGTCTTCACT 600

GTTATGAGGC GACAACTCCT GGAACAAGCA AGACAGGAAA AAGATAAACT GCCTCTTGAA 660

AAACGAACTC TAATCCTCAC TCATTTCCCA AAATTTCTGT CCATGCTAGA AGAAGAAGTA 720

TATAGTCAAA ACTCTCCCAT CTGGGATCAC CATTTTCTCT CAGCCTCTTC CAGAACCAGC 780

CAGCTAGGCA TCCAAACAGT TATCAATCAC CTCCTGTGGC TGGGACAATT TCATACAATT 840

CAACCTCATC TTCCCTTGAG CAGCCAAACG CAGGGAGCAG CAGTCCTGCC TGCAAAGCCT 900

CTTCTGGACT TGAGGCAAAC CCAGGAGAAA AGAGGAAAAT GACTGATTCT CATGTTCTGG 960

AGGAGGCCAA GAAACCCCGA GTTATGGGGG ATATTCCGAT GGAATTAATC AACGAGGTTA 1020

TGTCTACCAT CACGGACCCT GCAGCAATGC TTGGACCAGA GACCAATTTT CTGTCAGCAC 1080

ACTCGGCCAG GGATGAGGCG GCAAGGTTGG AAGAGCGCAG GGGTGTAATT GAATTTCACG 1140

TGGTTGGCAA TTCCCTCAAC CAGAAACCAA ACAAGAAGAT CCTGATGTGG CTGGTTGGCC 1200

TACAGAACGT TTTCTCCCAC CAGCTGCCCC GAATGCCAAA AGAATACATC ACACGGCTCG 1260

TCTTTGACCC GAAACACAAA ACCCTTGCTT TAATTAAAGA TGGCCGTGTT ATTGGTGGTA 1320

TCTGTTTCCG TATGTTCCCA TCTCAAGGAT TCACAGAGAT TGTCTTCTGT GCTGTAACCT 1380

CAAATGAGCA AGTCAAGGGC TATGGAACAC ACCTGATGAA TCATTTGAAA GAATATCACA 1440

TAAAGCATGA CATCCTGAAC TTCCTCACAT ATGCAGATGA ATATGCAATT GGATACTTTA 1500

AGAAACAGGG TTTCTCCAAA GAAATTAAAA TACCTAAAAC CAAATATGTT GGCTATATCA 1560

AGGATTATGA AGGAGCCACT TTAATGGGAT GTGAGCTAAA TCCACGGATC CCGTACACAG 1620

AATTTTCTGT CATCATTAAA AAGCAGAAGG AGATAATTAA AAAACTGATT GAAAGAAAAC 1680

AGGCACAAAT TCGAAAAGTT TACCCTGGAC TTTCATGTTT TAAAGATGGA GTTCGACAGA 1740

TTCCTATAGA AAGCATTCCT GGAATTAGAG AGACAGGCTG GAAACCGAGT GGAAAAGAGA 1800

AAAGTAAAGA GCCCAGAGAC CCTGACCAGC TTTACAGCAC GCTCAAGAGC ATCCTCCAGC 1860

AGGTGAAGAG CCATCAAAGC GCTTGGCCCT TCATGGAACC TGTGAAGAGA ACAGAAGCTC 1920

CAGGATATTA TGAAGTTATA AGGTCCCCCA TGGATCTCAA AACCATGAGT GAACGCCTCA 1980

AGAATAGGTA CTACGTGTCT AAGAAATTAT TCATGGCAGA CTTACAGCGA GTCTTTACCA 2040

ATTGCAAAGA GTACAACGCC CCTGAGAGTG AATACTACAA ATGTGCCAAT ATCCTGGAGA 2100

AATTCTTCTT CAGTAAAATT AAGGAAGCTG GATTAATTGA CAAGTGATTT TTTTTCCCCC 2160

TCTGCTTCTT AGAAACTCAC CAAGCAGTGT GCCTAAAGCA AGGT 2204

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2093 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GAATTCCGGC GAAACCACTC ATGTCTTTGG GCGAAGCCTT CTCCGGTCCA TTTTCACCGT 60

TACCCGCCGG CAGCTGCTGG AAAAGTTCCG AGTGGAGAAG GACAAATTGG TGCCCGAGAA 120

GAGGACCCTC ATCCTCACTC ACTTCCCCAA GTAAGGCTCC TTCTGGCCTA CCAGGATTTG 180

GCCCCAAGTT CACATCCTCC CTGTTGTCCC CTTTTTTCCA GGAAGGCTTC CTGGATTGGT 240

CCCTCCTCTC CCTCCATGGG CCTTTTGGGA TCTGGGCGTC TACCTGGCAG ACTTGCCCAT 300

GGCCCAGAAG CAACTTGCTA GTACTAGTCT GGGGATGGCA GATTCCTGTC CATGCTGGAG 360

GAGGAGATCT ATGGGGCAAA CTCTCCAATC TGGGAGTCAG GCTTCACCAT GCCACCCTCA 420

GAGGGGACAC AGCTGGTTCC CCGGCCAGCT TCAGTCAGTG CAGCGGTTGT TCCCAGCACC 480

CCCATCTTCA GCCCCAGCAT GGGTGGGGGC AGCAACAGCT CCCTGAGTCT GGATTCTGCA 540

GGGGCCGAGC CTATGCCAGG CGAGAAGAGG ACGCTCCCAG AGAACCTGAC CCTGGAGGAT 600

GCCAAGCGGC TCCGTGTGAT GGGTGACATC CCCATGGAGC TGGTCAATGA GGTCATGCTG 660

ACCATCACTG ACCCTGCTGC CATGCTGGGG CCTGAGACGA GCCTGCTTTC GGCCAATGCG 720

GCCCGGGATG AGACAGCCCG CCTGGAGGAG CGCCGCGGCA TCATCGAGTT CCATGTCATC 780

GGCAACTCAC TGACGCCCAA GGCCAACCGG CGGGTGTTGC TGTGGCTCGT GGGGCTGCAG 840

AATGTCTTTT CCCACCAGCT GCCGCGCATG CCTAAGGAGT ATATCGCCCG CCTCGTCTTT 900

GACCCGAAGC ACAAGACTCT GGCCTTGATC AAGGATGGGC GGGTCATCGG TGGCATCTGC 960

TTCCGCATGT TTCCCACCCA GGGCTTCACG GAGATTGTCT TCTGTGCTGT CACCTCGAAT 1020

GAGCAGGTCA AGGGTTATGG GACCCACCTG ATGAACCACC TGAAGGAGTA TCACATCAAG 1080

CACAACATTC TCTACTTCCT CACCTACGCC GACGAGTACG CCATCGGCTA CTTCAAAAAG 1140

CAGGGTTTCT CCAAGGACAT CAAGGTGCCC AAGAGCCGCT ACCTGGGCTA CATCAAGGAC 1200

TACGAGGGAG CGACGCTGAT GGAGTGTGAG CTGAATCCCC GCATCCCCTA CACGGAGCTG 1260

TCCCACATCA TCAAGAAGCA GAAAGAGATC ATCAAGAAGC TGATTGAGCG CAAACAGGCC 1320

CAGATCCGCA AGGTCTACCC GGGGCTCAGC TGCTTCAAGG AGGGCGTGAG GCAGATCCCT 1380

GTGGAGAGCG TTCCTGGCAT TCGAGAGACA GGCTGGAAGC CATTGGGGAA GGAGAAGGGG 1440

AAGGAGCTGA AGGACCCCGA CCAGCTCTAC ACAACCCTCA AAAACCTGCT GGCCCAAATC 1500

AAGTCTCACC CCAGTGCCTG GCCCTTCATG GAGCCTGTGA AGAAGTCGGA GGCCCCTGAC 1560

TACTACGAGG TCATCCGCTT CCCCATTGAC CTGAAGACCA TGACTGAGCG GCTGCGAAGC 1620

CGCTACTACG TGACCCGGAA GCTCTTTGTG GCCGACCTGC AGCGGGTCAT CGCCAACTGT 1680

CGCGAGTACA ACCCCCCGGA CAGCGAGTAC TGCCGCTGTG CCAGCGCCCT GGAGAAGTTC 1740

TTCTACTTCA AGCTCAAGGA GGGAGGCCTC ATTGACAAGT AGGCCCATCT TTGGGCCGCA 1800

GCCCTGACCT GGAATGTCTC CACCTCGGAT TCTGATCTGA TCCTTAGGGG GTGCCCTGGC 1860

CCCACGGACC CGACTCAGCT TGAGACACTC CAGCCAAGGG TCCTCCGGAC CCGATCCTGC 1920

AGCTCTTTCT GGACCTTCAG GCACCCCCAA GCGTGCAGCT CTGTCCCAGC CTTCACTGTG 1980

TGTGAGAGGT CTCCTGGGTT GGGGCCGAGC CCCTCTAGAG TAGCTGGTGG CCAGGGATGA 2040

ACCTTGCCCA GCCGTGGTGG CCCCCAGGCC TGGTCCCCAA GAGCCCGGAA TTC 2093

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9046 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

CCTTGTTTGT GTGCTAGGCT GGGGGGGAGA GAGGGCGAGA GAGAGCGGGC GAGAGTGGGC 60

AAGCAGGACG CCGGGCTGAG TGCTAACTGC GGGACGCAGA GAGTGCGGAG GGGAGTCGGG 120

TCGGAGAGAG GCGGCAGGGG CCAGAACAGT GGCAGGGGGC CCGGGGCGCA CGGGCTGAGG 180

CGACCCCCAG CCCCCTCCCG TCCGCACACA CCCCCACCGC GGTCCAGCAG CCGGGCCGGC 240

GTCGACGCTA GGGGGGACCA TTACATAACC CGCGCCCCGG CCGTCTTCTC CCGCCGCCGC 300

GGCGCCCGAA CTGAGCCCGG GGCGGGCGCT CCAGCACTGG CCGCCGGCGT GGGGCGTAGC 360

AGCGGCCGTA TTATTATTTC GCGGAAAGGA AGGCGAAGGA GGGGAGCGCC GGCGCGAGGA 420

GGGGCCGCCT GCGCCCGCCG CCGGAGCGGG GCCTCCTCGG TGGGCTCCGC GTCGGCGCGG 480

GCGTGCGGGC GGCGCTGCTC GGCCCGGCCC CCTCGGCCCT CTGGTCCGGC CAGCTCCGCT 540

CCCGGCGTCC TTGCCGCGCC TCCGCCGGCC GCCGCGCGAT GTGAGGCGGC GGCGCCAGCC 600

TGGCTCTCGG CTCGGGCGAG TTCTCTGCGG CCATTAGGGG CCGGTGCGGC GGCGGCGCGG 660

AGCGCGGCGG CAGGAGGAGG GTTCGGAGGG TGGGGGCGCA GGCCCGGGAG GGGGCACCGG 720

GAGGAGGTGA GTGTCTCTTG TCGCCTCCTC CTCTCCCCCC TTTTCGCCCC CGCCTCCTTG 780

TGGCGATGAG AAGGAGGAGG ACAGCGCCGA GGAGGAAGAG GTTGATGGCG GCGGCGGAGC 840

TCCGAGAGAC CTCGGCTGGG CAGGGGCCGG CCGTGGCGGG CCGGGGACTG CGCCTCTAGA 900

GCCGCGAGTT CTCGGGAATT CGCCGCAGCG GACCGGCCTC GGCGAATTTG TGCTCTTGTG 960

CCCTCCTCCG GGCTTGGGCC AGGCCGGCCC CTCGCACTTG CCCTTACCTT TTCTATCGAG 1020

TCCGCATCCC TCTCCAGCCA CTGCGACCCG GCGAAGAGAA AAAGGAACTT CCCCCACCCC 1080

CTCGGGTGCC GTCGGAGCCC CCCAGCCCAC CCCTGGGTGC GGCGCGGGGA CCCCGGGCCG 1140

AAGAAGAGAT TTCCTGAGGA TTCTGGTTTT CCTCGCTTGT ATCTCCGAAA GAATTAAAAA 1200

TGGCCGAGAA TGTGGTGGAA CCGGGGCCGC CTTCAGCCAA GCGGCCTAAA CTCTCATCTC 1260

CGGCCCTCTC GGCGTCCGCC AGCGATGGCA CAGATTTTGG CTCTCTATTT GACTTGGAGC 1320

ACGACTTACC AGATGAATTA ATCAACTCTA CAGAATTGGG ACTAACCAAT GGTGGTGATA 1380

TTAATCAGCT TCAGACAAGT CTTGGCATGG TACAAGATGC AGCTTCTAAA CATAAACAGC 1440

TGTCAGAATT GCTGCGATCT GGTAGTTCCC CTAACCTCAA TATGGGAGTT GGTGGCCCAG 1500

GTCAAGTCAT GGCCAGCCAG GCCCAACAGA GCAGTCCTGG ATTAGGTTTG ATAAATAGCA 1560

TGGTCAAAAG CCCAATGACA CAGGCAGGCT TGACTTCTCC CAACATGGGG ATGGGCACTA 1620

GTGGACCAAA TCAGGGTCCT ACGCAGTCAA CAGGTATGAT GAACAGTCCA GTAAATCAGC 1680

CTGCCATGGG AATGAACACA GGGACGAATG CGGGCATGAA TCCTGGAATG TTGGCTGCAG 1740

GCAATGGACA AGGGATAATG CCTAATCAAG TCATGAACGG TTCAATTGGA GCAGGCCGAG 1800

GGCGACAGGA TATGCAGTAC CCAAACCCAG GCATGGGAAG TGCTGGCAAC TTACTGACTG 1860

AGCCTCTTCA GCAGGGCTCT CCCCAGATGG GAGGACAAAC AGGATTGAGA GGCCCCCAGC 1920

CTCTTAAGAT GGGAATGATG AACAAGCCCA ATCCTTATGG TTCACCATAT ACTCAGAATC 1980

CTGGACAGCA GATTGGAGCC AGTGGCCTTG GTCTCCAGAT TCAGACAAAA ACTGTACTAT 2040

CAAATAACTT ATCTCCATTT GCTATGGACA AAAAGGCAGT TCCTGGTGGA GGAATGCCCA 2100

ACATGGGTCA ACAGCCAGCC CCGCAGGTCC AGCAGCCAGG TCTGGTGACT CCAGTTGCCC 2160

AAGGGATGGG TTCTGGAGCA CATACAGCTG ATCCAGAGAA GCGCAAGCTC ATCCAGCAGC 2220

AGCTTGTTCT CCTTTTGCAT GCTCACAAGT GCCAGCGCCG GGAACAGGCC AATGGGGAAG 2280

TGAGGCAGTG CAACCTTCCC CACTGTCGCA CAATGAAGAA TGTCCTAAAC CACATGACAC 2340

ACTGCCAGTC AGGCAAGTCT TGCCAAGTGG CACACTGTGC ATCTTCTCGA CAAATCATTT 2400

CACACTGGAA GAATTGTACA AGACATGATT GTCCTGTGTG TCTCCCCCTC AAAAATGCTG 2460

GTGATAAGAG AAATCAACAG CCAATTTTGA CTGGAGCACC CGTTGGACTT GGAAATCCTA 2520

GCTCTCTAGG GGTGGGTCAA CAGTCTGCCC CCAACCTAAG CACTGTTAGT CAGATTGATC 2580

CCAGCTCCAT AGAAAGAGCC TATGCAGCTC TTGGACTACC CTATCAAGTA AATCAGATGC 2640

CGACACAACC CCAGGTGCAA GCAAAGAACC AGCAGAATCA GCAGCCTGGG CAGTCTGCCC 2700

AAGGCATGCG GCCCATGAGC AACATGAGTG CTAGTCCTAT GGGAGTAAAT GGAGGTGTAG 2760

GAGTTCAAAC GCCGAGTCTT CTTTCTGACT CAATGTTGCA TTCAGCCATA AATTCTCAAA 2820

ACCCAATGAT GAGTGAAAAT GCCAGTGTGC CCTCCCTGGG TCCTATGCCA ACAGCAGCTC 2880

AACCATCCAC TACTGGAATT CGGAAACAGT GGCACGAAGA TATTACTCAG GATCTTCGAA 2940

ATCATCTTGT TCACAAACTC GTCCAAGCCA TATTTCCTAC GCCGGATCCT GCTGCTTTAA 3000

AAGACAGACG GATGGAAAAC CTAGTTGCAT ATGCTCGGAA AGTTGAAGGG GACATGTATG 3060

AATCTGCAAA CAATCGAGCG GAATACTACC ACCTTCTAGC TGAGAAAATC TATAAGATCC 3120

AGAAAGAACT AGAAGAAAAA CGAAGGACCA GACTACAGAA GCAGAACATG CTACCAAATG 3180

CTGCAGGCAT GGTTCCAGTT TCCATGAATC CAGGGCCTAA CATGGGACAG CCGCAACCAG 3240

GAATGACTTC TAATGGCCCT CTACCTGACC CAAGTATGAT CCGTGGCAGT GTGCCAAACC 3300

AGATGATGCC TCGAATAACT CCACAATCTG GTTTGAATCA ATTTGGCCAG ATGAGCATGG 3360

CCCAGCCCCC TATTGTACCC CGGCAAACCC CTCCTCTTCA GCACCATGGA CAGTTGGCTC 3420

AACCTGGAGC TCTCAACCCG CCTATGGGCT ATGGGCCTCG TATGCAACAG CCTTCCAACC 3480

AGGGCCAGTT CCTTCCTCAG ACTCAGTTCC CATCACAGGG AATGAATGTA ACAAATATCC 3540

CTTTGGCTCC GTCCAGCGGT CAAGCTCCAG TGTCTCAAGC ACAAATGTCT AGTTCTTCCT 3600

GCCCGGTGAA CTCTCCTATA ATGCCTCCAG GGTCTCAGGG GAGCCACATT CACTGTCCCC 3660

AGCTTCCTCA ACCAGCTCTT CATCAGAATT CACCCTCGCC TGTACCTAGT CGTACCCCCA 3720

CCCCTCACCA TACTCCCCCA AGCATAGGGG CTCAGCAGCC ACCAGCAACA ACAATTCCAG 3780

CCCCTGTTCC TACACCACCA GCCATGCCAC CTGGGCCACA GTCCCAGGCT CTACATCCCC 3840

CTCCAAGGCA GACACCTACA CCACCAACAP. CACAACTTCC CCAACAAGTG CAGCCTTCAC 3900

TTCCTGCTGC ACCTTCTGCT GACCAGCCCC AGCAGCAGCC TCGCTCACAG CAGAGCACAG 3960

CAGCGTCTGT TCCTACCCCA AACGCACCGC TGCTTCCTCC GCAGCCTGCA ACTCCACTTT 4020

CCCAGCCAGC TGTAAGCATT GAAGGACAGG TATCAAATCC TCCATCTACT AGTAGCACAG 4080

AAGTGAATTC TCAGGCCATT GCTGAGAAGC AGCCTTCCCA GGAAGTGAAG ATGGAGGCCA 4140

AAATGGAAGT GGATCAACCA GAACCAGCAG ATACGCAGCC GGAGGATATT TCAGAGTCTA 4200

AAGTGGAAGA CTGTAAAATG GAATCTACCG AAACAGAAGA GAGAAGCACT GAGTTAAAAA 4260

CTGAAATAAA AGAGGAGGAA GACCAGCCAA GTACTTCAGC TACCCAGTCA TCTCCGGCTC 4320

CAGGACAGTC AAAGAAAAAG ATTTTCAAAC CAGAAGAACT ACGACAGGCA CTGATGCCAA 4380

CATTGGAGGC ACTTTACCGT CAGGATCCAG AATCCCTTCC CTTTCGTCAA CCTGTGGACC 4440

CTCAGCTTTT AGGAATCCCT GATTACTTTG ATATTGTGAA GAGCCCCATG GATCTTTCTA 4500

GCATTAAGAG GAAGTTAGAC ACTGGACAGT ATCAGGAGCC CTGGCAGTAT GTCGATGATA 4560

TTTGGCTTAT GTTCAATAAT GCCTGGTTAT ATAACCGGAA AACATCACGG GTATACAAAT 4620

ACTGCTCCAA GCTCTCTGAG GTCTTTGAAC AAGAAATTGA CCCAGTGATG CAAAGCCTTG 4680

GATACTGTTG TGGCAGAAAG TTGGAGTTCT CTCCACAGAC ACTGTGTTGC TACGGCAAAC 4740

AGTTGTGCAC AATACCTCGT GATGCCACTT ATTACAGTTA CCAGAACAGG TATCATTTCT 4800

GTGAGAAGTG TTTCAATGAG ATCCAAGGGG AGAGCGTTTC TTTGGGGGAT GACCCTTCCC 4860

AGCCTCAAAC TACAATAAAT AAAGAACAAT TTTCCAAGAG AAAAAATGAC ACACTGGATC 4920

CTGAACTGTT TGTTGAATGT ACAGAGTGCG GAAGAAAGAT GCATCAGATC TGTGTCCTTC 4980

ACCATGAGAT CATCTGGCCT GCTGGATTCG TCTGTGATGG CTGTTTAAAG AAAAGTGCAC 5040

GAACTAGGAA AGAAAATAAG TTTTCTGCTA AAAGGTTGCC ATCTACCAGA CTTGGCACCT 5100

TTCTAGAGAA TCGTGTGAAT GACTTTCTGA GGCGACAGAA TCACCCTGAG TCAGGAGAGG 5160

TCACTGTTAG AGTAGTTCAT GCTTCTGACA AAACCGTGGA AGTAAAACCA GGCATGAAAG 5220

CAAGGTTTGT GGACAGTGGA GAGATGGCAG AATCCTTTCC ATACCGAACC AAAGCCCTCT 5280

TTGCCTTTGA AGAAATTGAT GGTGTTGACC TGTGCTTCTT TGGCATGCAT GTTCAAGAGT 5340

ATGGCTCTGA CTGCCCTCCA CCCAACCAGA GGAGAGTATA CATATCTTAC CTCGATAGTG 5400

TTCATTTCTT CCGTCCTAAA TGCTTGAGGA CTGCAGTCTA TCATGAAATC CTAATTGGAT 5460

ATTTAGAATA TGTCAAGAAA TTAGGTTACA CAACAGGGCA TATTTGGGCA TGTCCACCAA 5520

GTGAGGGAGA TGATTATATC TTCCATTGCC ATCCTCCTGA CCAGAAGATA CCCAAGCCCA 5580

AGCGACTGCA GGAATGGTAC AAAAAAATGC TTGACAAGGC TGTATCAGAG CGTATTGTCC 5640

ATGACTACAA GGATATTTTT AAACAAGCTA CTGAAGATAG ATTAACAAGT GCAAAGGAAT 5700

TGCCTTATTT CGAGGGTGAT TTCTGGCCCA ATGTTCTGGA AGAAAGCATT AAGGAACTGG 5760

AACAGGAGGA AGAAGAGAGA AAACGAGAGG AAAACACCAG CAATGAAAGC ACAGATGTGA 5820

CCAAGGGAGA CAGCAAAAAT GCTAAAAAGA AGAATAATAA GAAAACCAGC AAAAATAAGA 5880

GCAGCCTGAG TAGGGGCAAC AAGAAGAAAC CCGGGATGCC CAATGTATCT AACGACCTCT 5940

CACAGAAACT ATATGCCACC ATGGAGAAGC ATAAAGAGGT CTTCTTTGTG ATCCGCCTCA 6000

TTGCTGGCCC TGCTGCCAAC TCCCTGCCTC CCATTGTTGA TCCTGATCCT CTCATCCCCT 6060

GCGATCTGAT GGATGGTCGG GATGCGTTTC TCACGCTGGC AAGGGACAAG CACCTGGAGT 6120

TCTCTTCACT CCGAAGAGCC CAGTGGTCCA CCATGTGCAT GCTGGTGGAG CTGCACACGC 6180

AGAGCCAGGA CCGCTTTGTC TACACCTGCA ATGAATGCAA GCACCATGTG GAGACACGCT 6240

GGCACTGTAC TGTCTGTGAG GATTATGACT TGTGTATCAC CTGCTATAAC ACTAAAAACC 6300

ATGACCACAA AATGGAGAAA CTAGGCCTTG GCTTAGATGA TGAGAGCAAC AACCAGCAGG 6360

CTGCAGCCAC CCAGAGCCCA GGCGATTCTC GCCGCCTGAG TATCCAGCGC TGCATCCAGT 6420

CTCTGGTCCA TGCTTGCCAG TGTCGGAATG CCAATTGCTC ACTGCCATCC TGCCAGAAGA 6480

TGAAGCGGGT TGTGCAGCAT ACCAAGGGTT GCAAACGGAA AACCAATGGC GGGTGCCCCA 6540

TCTGCAAGCA GCTCATTGCC CTCTGCTGCT ACCATGCCAA GCACTGCCAG GAGAACAAAT 6600

GCCCGGTGCC GTTCTGCCTA AACATCAAGC AGAAGCTCCG GCAGCAACAG CTGCAGCACC 6660

GACTACAGCA GGCCCAAATG CTTCGCAGGA GGATGGCCAG CATGCAGCGG ACTGGTGTGG 6720

TTGGGCAGCA ACAGGGCCTC CCTTCCCCCA CTCCTGCCAC TCCAACGACA CCAACTGGCC 6780

AACAGCCAAC CACCCCGCAG ACGCCCCAGC CCACTTCTCA GCCTCAGCCT ACCCCTCCCA 6840

ATAGCATGCC ACCCTACTTG CCCAGGACTC AAGCTGCTGG CCCTGTGTCC CAGGGTAAGG 6900

CAGCAGGCCA GGTGACCCCT CCAACCCCTC CTCAGACTGC TCAGCCACCC CTTCCAGGGC 6960

CCCCACCTAC AGCAGTGGAA ATGGCAATGC AGATTCAGAG AGCAGCGGAG ACGCAGCGCC 7020

AGATGGCCCA CGTGCAAATT TTTCAAAGGC CAATCCAACA CCAGATGCCC CCGATGACTC 7080

CCATGGCCCC CATGGGTATG AACCCACCTC CCATGACCAG AGGTCCCAGT GGGCATTTGG 7140

AGCCAGGGAT GGGACCGACA GGGATGCAGC AACAGCCACC CTGGAGCCAA GGAGGATTGC 7200

CTCAGCCCCA GCAACTACAG TCTGGGATGC CAAGGCCAGC CATGATGTCA GTGGCCCAGC 7260

ATGGTCAACC TTTGAACATG GCTCCACAAC CAGGATTGGG CCAGGTAGGT ATCAGCCCAC 7320

TCAAACCAGG CACTGTGTCT CAACAAGCCT TACAAAACCT TTTGCGGACT CTCAGGTCTC 7380

CCAGCTCTCC CCTGCAGCAG CAACAGGTGC TTAGTATCCT TCACGCCAAC CCCCAGCTGT 7440

TGGCTGCATT CATCAAGCAG CGGGCTGCCA AGTATGCCAA CTCTAATCCA CAACCCATCC 7500

CTGGGCAGCC TGGCATGCCC CAGGGGCAGC CAGGGCTACA GCCACCTACC ATGCCAGGTC 7560

AGCAGGGGGT CCACTCCAAT CCAGCCATGC AGAACATGAA TCCAATGCAG GCGGGCGTTC 7620

AGAGGGCTGG CCTGCCCCAG CAGCAACCAC AGCAGCAACT CCAGCCACCC ATGGGAGGGA 7680

TGAGCCCCCA GGCTCAGCAG ATGAACATGA ACCACAACAC CATGCCTTCA CAATTCCGAG 7740

ACATCTTGAG ACGACAGCAA ATGATGCAAC AGCAGCAGCA ACAGGGAGCA GGGCCAGGAA 7800

TAGGCCCTGG AATGGCCAAC CATAACCAGT TCCAGCAACC CCAAGGAGTT GGCTACCCAC 7860

CACAGCCGCA GCAGCGGATG CAGCATCACA TGCAACAGAT GCAACAAGGA AATATGGGAC 7920

AGATAGGCCA GCTTCCCCAG GCCTTGGGAG CAGAGGCAGG TGCCAGTCTA CAGGCCTATC 7980

AGCAGCGACT CCTTCAGCAA CAGATGGGGT CCCCTGTTCA GCCCAACCCC ATGAGCCCCC 8040

AGCAGCATAT GCTCCCAAAT CAGGCCCAGT CCCCACACCT ACAAGGCCAG CAGATCCCTA 8100

ATTCTCTCTC CAATCAAGTG CGCTCTCCCC AGCCTGTCCC TTCTCCACGG CCACAGTCCC 8160

AGCCCCCCCA CTCCAGTCCT TCCCCAAGGA TGCAGCCTCA GCCTTCTCCA CACCACGTTT 8220

CCCCACAGAC AAGTTCCCCA CATCCTGGAC TGGTAGCTGC CCAGGCCAAC CCCATGGAAC 8280

AAGGGCATTT TGCCAGCCCG GACCAGAATT CAATGCTTTC TCAGCTTGCT AGCAATCCAG 8340

GCATGGCAAA CCTCCATGGT GCAAGCGCCA CGGACCTGGG ACTCAGCACC GATAACTCAG 8400

ACTTGAATTC AAACCTCTCA CAGAGTACAC TAGACATACA CTAGAGACAC CTTGTATTTT 8460

GGGAGCAAAA AAATTATTTT CTCTTAACAA GACTTTTTGT ACTGAAAACA ATTTTTTTGA 8520

ATCTTTCGTA GCCTAAAAGA CAATTTTCCT TGGAACACAT AAGAACTGTG CAGTAGCCGT 8580

TTGTGGTTTA AAGCAAACAT GCAAGATGAA CCTGAGGGAT GATAGAATAC AAAGAATATA 8640

TTTTTGTTAT GGGCTGGTTA CCACCAGCCT TTCTTCCCCT TTGTGTGTGT GGTTCAAGTG 8700

TGCACTGGGA GGAGGCTGAG GCCTGTGAAG CCAAACAATA TGCTCCTGCC TTGCACCTCC 8760

AATAGGTTTT ATTATTTTTT TTAAATTAAT GAACATATGT AATATTAATG AACATATGTA 8820

ATATTAATAG TTATTATTTA CTGGTGCAGA TGGTTGACAT TTTTCCCTAT TTTCCTCACT 8880

TTATGGAAGA GTTAAAACAT TTCTAAACCA GAGGACAAAA GGGGTTAATG TTACTTTGAA 8940

ATTACATTCT ATATATATAT AAATATATAT AAATATATAT TAAAATACCA GTTTTTTTTC 9000

TCTGGGTGCA AAGATGTTCA TTCTTTTAAA AAATGTTTAA AAAAAA 9046

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7326 base pairs. 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATGGCCGAGA ACTTGCTGGA CGGACCGCCC AACCCCAAAC GAGCCAAACT CAGCTCGCCC 60 

GGCTTCTCCG CGAATGACAA CACAGATTTT GGATCATTGT TTGACTTGGA AAATGACCTT 12 0 

CCTGATGAGC TGATCCCCAA TGGAGAATTA AGCCTTTTAA ACAGTGGGAA CCTTGTTCCA 18 0 

GATGCTGCGT CCAAACATAA ACAACTGTCA GAGCTTCTTA GAGGAGGCAG CGGCTCTAGC 240 

ATCAACCCAG GGATAGGCAA TGTGAGTGCC AGCAGCCCTG TGCAACAGGG CCTTGGTGGC . 300 

CAGGCTCAGG GGCAGCCGAA CAGTACAAAC ATGGCCAGCT TAGGTGCCAT GGGCAAGAGC 3 60 

CCTCTGAACC AAGGAGACTC ATCAACACCC AACCTGCCCA AACAGGCAGC CAGCACCTCT 42 0 

GGGCCCACTC CCCCTGCCTC CCAAGCACTG AATCCACAAG CACAAAAGCA AGTAGGGCTG 48 0 

GTGACCAGTA GTCCTGCCAC ATCACAGACT GGACCTGGGA TCTGCATGAA TGCTAACTTC 54 0 

AACCAGACCC ACCCAGGCCT TCTCAATAGT AACTCTGGCC ATAGCTTAAT GAATCAGGCT 600 

CAACAAGGGC AAGCTCAAGT CATGAATGGA TCTCTTGGGG CTGCTGGAAG AGGAAGGGGA 6 60 

GCTGGAATGC CCTACCCTGC TCCAGCCATG CAGGGGGCCA CAAGCAGTGT GCTGGCGGAG 72 0 

ACCTTGACAC AGGTTTCCCC ACAAATGGCT GGCCATGCTG GACTAAATAC AGCACAGGCA 78 0 

GGAGGCATGA CCAAGATGGG AATGACTGGT ACCACAAGTC CATTTGGACA ACCCTTTAGT 84 0 

CAAACTGGAG GGCAGCAGAT GGGAGCCACT GGAGTGAACC CCCAGTTAGC CAGCAAACAG 90 0 

AGCATGGTCA ATAGTTTACC TGCTTTTCCT ACAGATATCA AGAATACTTC AGTCACCACT 9 60 

GTGCCAAATA TGTCCCAGTT GCAAACATCA GTGGGAATTG TACCCACACA AGCAATTGCA 102 0 

ACAGGCCCCA CAGCAGACCC TGAAAAACGC AAACTGATAC AGCAGCAGCT GGTTCTACTG 108 0 

CTTCATGCCC ACAAATGTCA GAGACGAGAG CAAGCAAATG GAGAGGTTCG NGCCTGTTCT 114 0 

CTCCCACACT GTCGAACCAT GAAAAACGTT TTGAATCACA TGACACATTG TCAGGCTCCC 12 0 0 

AAAGCCTGCC AAGTTGCCCA TTGTGCATCT TCACGACAAA TCATCTCTCA TTGGAAGAAC 12 60 

TGCACACGAC ATGACTGTCC TGTTTGCCTC CCTTTGAAAA ATGCCAGTGA CAAGCGAAAC 132 0 

CAACAAACCA TCCTGGGATC TCCAGCTAGT GGAATTCAAA ACACAATTGG TTCTGTTGGT 138 0 

GCAGGGCAAC AGAATGCCAC TTCCTTAAGT AACCCAAATC , CCATAGACCC CAGTTCCATG 144 0 

CAGCGGGCCT ATGCTGCTCT AGGACTCCCC TACATGAACC AGCCTCAGAC GCAGCTGCAG 150 0 

CCTCAGGTTC CTGGCCAGCA ACCAGCACAG CCTCCAGCCC ACCAGCAGAT GAGGACTCTC 15 60 

AATGCCCTAG GAAACAACCC CATGAGTGTC CCAGCAGGAG GAATAACAAC AGATCAACAG 162 0 

CCACCAAACT TGATTTCAGA ATCAGCTCTT CCAACTTCCT TGGGGGCTAC CAATCCACTG 168 0 

ATGAATGATG GTTCAAACTC TGGTAACATT GGAAGCCTCA GCACGATACC TACAGCAGCG 17 4 0 

CCTCCTTCCA GCACTGGTGT TCGAAAAGGC TGGCATGAAC ATGTGACTCA GGACCTACGG 18 00 

AGTCATCTAG TCCATAAACT CGTTCAAGCC ATCTTCCCAA CTCCAGACCC TGCAGCTCTG 18 60 

AAAGATCGCC GCATGGAGAA CCTGGTTGCC TATGCTAAGA AAGTGGAGGG AGACATGTAT 192 0 

GAGTCTGCTA ATAGCAGGGA TGAATACTAT CATTTATTAG CAGAGAAAAT CTATAAAATA 198 0 

CAAAAAGAAC TAGAAGAAAA GCGGAGGACA CGTTTACATA AGCAAGGCAT CCTGGGTAAC 2040

CAGCCAGCTT TACCAGCTTC TGGGGCTCAG CCCCCTGTGA TTCCACCAGC CCAGTCTGTA 2100

AGACCTCCAA ATGGGCCCCT GCCTTTGCCA GTGAATCGCA TGCAGGTTTC TCAAGGGATG 2160

AATTCATTTA ACCCAATGTG CCTGGGAAAC GTCCAGTTGC CACAGGCACC CATGGGACCT 2220

CGTGCAGCCT CCCCTATGAA CCACTCTGTG CAGATGAACA GCATGGCCTC AGTTCCGGGT 2280

ATGGCCATTT CTCCTTCACG GATGCCTCAG CCTCCAAATA TGATGGGCAC TCATGCCAAC 2340

AACATTATGG CCCAGGCACC TACTCAGAAC CAGTTTCTGC CACAGAACCA GTTTCCATCA 2400

TCCAGTGGGG CAATGAGTGT GAACAGTGTG GGCATGGGGC AACCAGCAGC CCAGGCAGGT 2460

GTTTCACAGG GTCAGGAACC TGGAGCTGCT CTCCCTAACC CTCTGAACAT GCTGGCACCC 2520

CAGGCCAGCC AGCTGCCTTG CCCACCAGTG ACACAGTCAC CATTGCACCC GACTCCACCT 2580

CCTGCTTCCA CAGCTGCTGG CATGCCCTCT CTCCAACATC CAACGGCACC AGGAATGACC 2640

CCTCCTCAGC CAGCAGCTCC CACTCAGCCA TCTACTCCTG TGTCATCTGG GCAGACTCCT 2700

ACCCCAACTC CTGGCTCAGT GCCCAGCGCT GCCCAAACAC AGAGTACCCC TACAGTCCAG 2760

GCAGCAGCAC AGGCTCAGGT GACTCCACAG CCTCAGACCC CAGTGCAGCC ACCATCTGTG 2820

GCTACTCCTC AGTCATCACA GCAGCAACCA ACGCCTGTGC ATACTCAGCC ACCTGGCACA 2880

CCGCTTTCTC AGGCAGCAGC CAGCATTGAT AATAGAGTCC CTACTCCCTC CACTGTGACC 2940

AGTGCTGAAA CCAGTTCCCA GCAGCCAGGA CCCGATGTGC CCATGCTGGA AATGAAGACA 3000

GAGGTGCAGA CAGATGATGC TGAGCCTGAA CCTACTGAAT CCAAGGGGGA ACCTCGGTCT 3060

GAGATGATGG AAGAGGATTT ACAAGGTTCT TCCCAAGTAA AAGAAGAGAC AGATACGACA 3120

GAGCAGAAGT CAGAGCCAAT GGAAGTAGAA GAAAAGAAAC CTGAAGTAAA AGTGGAAGCT 3180

AAAGAGGAAG AAGAGAACAG TTCGAACGAC ACAGCCTCAC AATCAACATC TCCTTCCCAG 3240

CCACGCAAAA AAATCTTTAA ACCCGAGGAG CTACGCCAGG CACTTATGCC AACTCTAGAA 3300

GCACTCTATC GACAGGACCC AGAGTCTTTG CCTTTTCGTC AGCCTGTAGA TCCTCAGCTC 3360

CTAGGAATCC CAGATTATTT TGATATAGTG AAGAATCCTA TGGACCTTTC TACCATCAAA 3420

CGAAAGCTGG ACACAGGGCA ATATCAAGAA CCCTGGCAGT ATGTGGATGA TGTCAGGCTT 3480

ATGTTCAACA ATGCGTGGCT ATATAATCGT AAAACGTCCC GTGTATATAA ATTTTGCAGT 3540

AAACTTGCAG AGGTCTTTGA ACAAGAAATT GACCCTGTCA TGCAGTCTCT TGGATATTGC 3600

TGTGGACGAA AGTATGAGTT CTCCCCACAG ACTTTGTGCT GTTACGGAAA GCAGCTGTGT 3660

ACAATTCCTC GTGATGCAGC CTACTACAGC TATCAGAATA GGTATCATTT CTGTGGGAAG 3720

TGTTTCACAG AGATCCAGGG CGAGAATGTG ACCCTGGGTG ACGACCCTTC CCAACCTCAG 3780

ACGACAATTT CCAAGGATCA ATTTGAAAAG AAGAAAAATG ATACCTTAGA TCCTGAACCT 3840

TTTGTTGACT GCAAAGAGTG TGGCCGGAAG ATGCATCAGA TTTGTGTTCT ACACTATGAC 3900 

ATCATTTGGC CTTCAGGTTT TGTGTGTGAC AACTGTTTGA AGAAAACTGG CAGACCTCGG 3960 

AAAGAAAACA AATTCAGTGC TAAGAGGCTG CAGACCACAC GATTGGGAAA CCACTTAGAA 4020 

GACAGAGTGA ATAAGTTTTT GCGGCGCCAG AATCACCCTG AAGCTGGGGA GGTTTTTGTC 4080

AGAGTGGTGG CCAGCTCAGA CAAGACTGTG GAGGTCAAGC CGGGAATGAA GTCAAGGTTT 4140

GTGGATTCTG GAGAGATGTC GGAATCTTTC CCATATCGTA CCAAAGCACT CTTTGCTTTT 4200

GAGGAGATCG ATGGAGTCGA TGTGTGCTTT TTTGGGATGC ATGTGCAAGA TACGGCTCTG 4260

ATTGCCCCCC ACCAAATACA AGGCTGTGTA TACATATCTT ATCTGGACAG TATTCATTTC 4320

TTCCGGCCCC GCTGCCTCCG GACAGCTGTT TACCATGAGA TCCTCATCGG ATATCTCGAG 4380

TATGTGAAGA AATTGGTGTA TGTGACAGCA CATATTTGGG CCTGTCCCCC AAGTGAAGGA 4440

GATGACTATA TCTTTCATTG CCACCCCCCT GACCAGAAAA TCCCCAAACC AAAACGACTA 4500

CAGGAGTGGT ACAAGAAGAT GCTGGACAAG GCGTTTGCAG AGAGGATCAT TAACGACTAT 4560

AAGGACATCT TCAAACAAGC GAACGAAGAC AGGCTCACGA GTGCCAAGGA GTTGCCCTAT 4620

TTTGAAGGAG ATTTCTGGCC TAATGTGTTG GAAGPJMKGCA TTAAGGAACT AGAACAAGAA 4680

GAAGAAGAAA GGAAAAAAGA AGAGAGTACT GCAGCGAGTG AGACTCCTGA GGGCAGTCAG 4740

GGTGACAGCA AAAATGCGAA GAAAAAGAAC AACAAGAAGA CCAACAAAAA CAAAAGCAGC 4800

ATTAGCCGCG CCAACAAGAA GAAGCCCAGC ATGCCCAATG TTTCCAACGA CCTGTCGCAG 4860

AAGCTGTATG CCACCATGGA GAAGCACAAG GAGGTATTCT TTGTGATTCA TCTGCATGCT 4920

GGGCCTGTTA TCAGCACTCA GCCCCCCATC GTGGACCCTG ATCCTCTGCT TAGCTGTGAC 4980

CTCATGGATG GGCGAGATGC CTTCCTCACC CTGGCCAGAG ACAAGCACTG GGAATTCTCT 5040

TCCTTACGCC GCTCCAAATG GTCCACTCTG TGCATGCTGG TGGAGCTGCA CACACAGGGC 5100

CAGGACCGCT TTGTTTATAC CTGCAATGAG TGCAAACACC ATGTGGAAAC ACGCTGGCAC 5160

TGCACTGTGT GTGAGGACTA TGACCTTTGT ATCAATTGCT ACAACACAAA GAGCCACACC 5220

CATAAGATGG TGAAGTGGGG GCTAGGCCTA GATGATGAGG GCAGCAGTCA GGGTGAGCCA 5280

CAGTCCAAGA GCCCCCAGGA ATCCCGGCGT CTCAGCATCC AGCGCTGCAT CCAGTCCCTG 5340

GTGCATGCCT GCCAGTGTCG CAATGCCAAC TGCTCACTGC CGTCTTGCCA GAAGATGAAG 5400

CGAGTCGTGC AGCACACCAA GGGCTGCAAG CGCAAGACTA ATGGAGGATG CCCAGTGTGC 5460

AAGCAGCTCA TTGCTCTTTG CTGCTACCAC GCCAAACACT GCCAAGAAAA TAAATGCCCT 5520

GTGCCCTTCT GCCTCAACAT CAAACATAAC GTCCGCCAGC AGCAGATCCA GCACTGCCTG 5580

CAGCAGGCTC AGCTCATGCG CCGGCGAATG GCAACCATGA ACACCCGCAA TGTGCCTCAG 5640

CAGAGTTTGC CTTCTCCTAC CTCAGCACCA CCCGGGACTC CTACACAGCA GCCCAGCACA 5700

CCCCAAACAC CACAGCCCCC AGCCCAGCCT CAGCCTTCAC CTGTTAACAT GTCACCAGCA 5760

GGCTTCCCTA ATGTAGCCCG GACTCAGCCC CCAACAATAG TGTCTGCTGG GAAGCCTACC 5820

AACCAGGTGC CAGCTCCCCC ACCCCCTGCC CAGCCCCCAC CTGCAGCAGT AGAAGCAGCC 5880

CGGCAAATTG AACGTGAGGC CCAGCAGCAG CAGCACCTAT AGCGAGCAAA CATCAACAAT 5940

GGCATGCCCC CAGGACGTGA CGGTATGGGG ACCCCAGGAA GCCAAATGAC TCCTGTGGGC 6000 

CTGAATGTGC CCCGTCCCAA CCTV^GTCAGT GGGCCTGTCA TGTCTAGTAT GCCACCTGGG 6060 

CAGTGGCAGC AGGCACCCAT CCCTCAGCAG CAGCCGATGC CAGGCATGCC CAGGCCTGTA 6120

ATGTCCATGC AGGCCCAGGC AGCAGTGGCT GGGCCACGGA TGCCCAATGT GCAGCCAAAC 6180

AGGAGCATCT CGCCAAGTGC CCTGCAAGAC CTGCTACGGA CCCTAAAGTC ACCCAGCTCT 6240

CCTCAGCAGC AGCAGCAGGT GCTGAACATC CTTAAATCAA ACCCACAGCT AATGGCAGCT 6300

TTCATCAAAC AGCGCACAGC CAAGTATGTG GCCAATCAGC CTGGCATGCA GCCCCAGCCC 6360

GGACTTCAAT CCCAGCCTGG TATGCAGCCC CAGCCTGGCA TGCACCAGCA GCCTAGTTTG 6420

CAAAACCTGA ACGCAATGCA AGCTGGTGTG CCACGGCCTG GTGTGCCTCC ACCACAACCA 6480

GCAATGGGAG GCCTGAATCC CCAGGGACAA GCTCTGAACA TCATGAACCC AGGACACAAC 6540

CCCAACATGA CAAACATGAA TCCACAGTAC CGAGAAATGG TGAGGAGACA GCTGCTACAG 6600

CACCAGCAGC AGCAGCAGCA ACAGCAGCAG CAGCAGCAGC AACAACAAAA TAGTGCCAGC 6660

TTGGCCGGGG GCATGGCGGG ACACAGCCAG TTCCAGCAGC CACAAGGACC TGGAGGTTAT 6720
GCCCCAGCCA TGCAGCAGCA ACGCATGCAA CAGCACCTCC CCATCCAGGG CAGCTCCATG 6780
GGCCAGATGG CTGCTCCAAT GGGACAACTT GGCCAGATGG GGCAGCCTGG GCTAGGGGCA 6840
GACAGCACCC CTAATATCCA GCAGGCCCTG CAGCAACGGA TTCTGCAGCA GCAGCAGATG 6900
AAGCAACAAA TTGGGTCACC AGGCCAGCCG AACCCCATGA GCCCCCAGCA GCACATGCTC 6960
TCAGGACAGC CACAGGCCTC ACATCTCCCT GGCCAGCAGA TCGCCACATC CCTTAGTAAC 7020
CAGGTGCGAT CTCCAGCCCC TGTGCAGTCT CCACGGCCCC AATCCCAACC TCCACATTCC 7080
AGCCCGTCAC CACGGATACA ACCCCAGCCT TCACCACACC ATGTTTCACC CCAGACTGGA 7140
ACCCCTCACC CTGGACTCGC AGTCACCATG GCCAGCTCCA TGGATCAGGG ACACCTGGGG 7200
AACCCTGAAC AGAGTGCAAT GCTCCCCCAG CTGAATACCC CCAACAGGAG CGCACTGTCC 7260
AGTGAACTGT CCCTGGTTGG TGATACCACG GGAGACACAC TAGAAAAGTT TGTGGAGGGT 7320
TTGTAG 7326

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2499 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TCACTTGTCA ATTAATCCAG CTTCCTTAAT TTTACTGAAG AAGAATTTCT CCAGGATATT 60
GGCACATTTG TAGTATTCAC TCTCAGGGGC GTTGTACTCT TTGCAATTGG TAAAGACTCG 120
CTGTAAGTCT GCCATGAATA ATTTCTTAGA CACGTAGTAC CTATTCTTGA GGCGTTCACT 180
CATGGTTTTG AGATCCATGG GGGACCTTAT AACTTCATAA TATCCTGGAG CTTCTGTTCT 240
CTTCACAGGT TCCATGAAGG GCCAAGCGCT TTGATGGCTC TTCACCTGCT GGAGGATGCT 300
CTTGAGCGTG CTGTAAAGCT GGTCAGGGTC TCTGGGCTCT TTACTTTTCT CTTTTCCACT 360
CGGTTTCCAG CCTGTCTCTC TAATTCCAGG AATGCTTTCT ATAGGAATCT GTCGAACTCC 420
ATCTTTAAAA CATGAAAGTC CAGGGTAAAC TTTTCGAATT TGTGCCTGTT TTCTTTCAAT 480
CAGTTTTTTA ATTATCTCCT TCTGCTTTTT AATGATGACA GAAAATTCTG TGTACGGGAT 540
CCGTGGATTT AGCTCACATC CCATTAAAGT GGCTCCTTCA TAATCCTTGA TATAGCCAAC 600
ATATTTGGTT TTAGGTATTT TAATTTCTTT GGAGAAACCC TGTTTCTTAA AGTATCCAAT 660
TGCATATTCA TCTGCATATG TGAGGAAGTT CAGGATGTCA TGCTTTATGT GATATTCTTT 720
CAAATGATTC ATCAGGTGTG TTCCATAGCC CTTGACTTGC TCATTTGAGG TTACAGCACA 780
GAAGACAATC TCTGTGAATC CTTGAGATGG GAACATACGG AAACAGATAC CACCAATAAC 840
ACGGCCATCT TTAATTAAAG CAAGGGTTTT GTGTTTCGGG TCAAAGACGA GCCGTGTGAT 900
GTATTCTTTT GGCATTCGGG GCAGCTGGTG GGAGAAAACG TTCTGTAGGC CAACCAGCCA 960
CATCAGGATC TTCTTGTTTG GTTTCTGGTT GAGGGAATTG CCAACCACGT GAAATTCAAT 1020
TACACCCCTG CGCTCTTCCA ACCTTGCCGC CTCATCCCTG GCCGAGTGTG CTGACAGAAA 1080
ATTGGTCTCT GGTCCAAGCA TTGCTGCAGG GTCCGTGATG GTAGACATAA CCTCGTTGAT 1140
TAATTCCATC GGAATATCCC CCATAACTCG GGGTTTCTTG GCCTCCTCCA GAACATGAGA 1200
ATCAGTCATT TTCCTCTTTT CTCCTGGGTT TGCCTCAAGT CCAGAAGAGG CTTTGCAGGC 1260
AGGACTGCTG CTCCCTGCGT TTGGCTGCTC AAGGGAAGAT GAGGTTGAAT TGTATGAAAT 1320
TGTCCCAGCC ACAGGAGGTG GATTGATAAC TGTTTGGATG CCTAGCTGGC TGGTTCTGGA 1380
AGAGGCTGAG AGAAAATCCT GATCCCAGAT GGGAGAGTTT TGACTATATA CTTCTTCTTC 1440
TAGCATGGAC AGAAATTTTG GGAAATGAGT GAGGATTAGA GTTCGTTTTT CAAGAGGCAG 1500
TTTATCTTTT TCCTGTCTTG CTTGTTCCAG GAGTTGTCGC CTCATAACAG TGAAGACCGA 1560
GCGAAGCAAT GTTCTCCCAA ACACCTGTGT GGTTTCGTAC CGAGGTAGAC TGTCGCAGAA 1620
CTGTGGCACG TTGCAGTAAC ACAGCCACCT TGTGTAGTTC TCTTTGTATC CAGAAATATC 1680
ATCATTGGGA GATCGCAGTC TTCGTTGAGA TGGTGCCTCC AGATGCCAAT AGTTGATGCG 1740
GTTTAGGAAC ATTTTTGCCA ACTCAACTAT TGTTTGCCTT TCTTTTGCTG GCAGGTGACT 1800
AAATTTGTAC TGCACAAAGT TATTCACACC CTGTTCAATG CTAGGTTTTT CAAATGGGGG 1860
TTTCTTTTCC AAAGAGCCTT CAACCACAGG TTTTCCTCTT TGTAAAATAG ACTTTCTCAA 1920
GAGCTTAAAT AGATAGAAAT AAACTTGTTT GGTATCTGCA TCTTCTTCCT TGTGGACACA 1980
GGTAAAGAGA TATTCCACAT CCAATACTAT TCCCAGGAGT CTGTTCATTT CTTCCTCTGA 2040
CACATTCTCC AGGTGGGAAA CATGAGCAGC TAGGGCATGG CTACAACTCC GACAGGATTC 2100
TGTTAGACTG ACAATTATTT GCTGCAGGTC GGCTCTGGGG GGAGTGGGTG AGGGGTTAGG 2160
GTTTTTCCAG CCATTACATT TACAAGACTC CTCGGCCTTG CAGGCGGAGT ACACTCCGAG 2220
TTTCTCCAGT TTCTTGGCCC GCGGAGCGGA GCGTAGTTGC GCTTTCTTCA CGGCGATTCG 2280
GGCCGAGCCA CCGCCTCCCG GTCCTTCGGC CGTGCCCGCT GCAGCCACTG CCGTCGCCGG 2340
ACCGCAGGCG CCCGAGCCCC CGGCGGCAGC GGCGCAGGGG GAGCCCTGCG GGGGCGCGGG 2400
CGGAAGCGCC GCAGGCTGCG GGGGCAGCGC CCCGGGCCCG GCCCCTGCCC CGGCTCCTGC 2460
CCCGCAGCCG CCCGGCCCGG CCCCGCCAGC CTCGGACAT 2499

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2442 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

TCACTTGTCA ATCAACCCTG CTTCCTTAAT TTTACTGAAG AAGAACTTCT CCAGGATGCT 60 

GGCGCATTTG TAGTACTCGC TCTCGGGAGG GTTGTACTCC TTGCAGTTGG TGAACACTCG 120

TTGCAAGTCC GCCATGAATA ACTTCTTAGA CACATAGTAC CTGTTCCTGA GGCGTTCACT 180

CATGGTTTTC AGATCCATGG GGAACCTTAT AACTTCATAA TATCCCGGAG CTTCTGTTCT 240

CTTCACTGGT TCCATGAAAG GCCAAGCATT TGGATGGTTC TTCACCTGCT GCAGGATGTT 300

CTTGAGGGTG CTGTAAACGT GCTCAGGGTC TTTGGGCTCT TTACTTTTCT CTTTTCCACT 360

TGGTTTCCAG CCTGTCTCTC TGATTCCAGG AATGCTTTCT ATAGGAATCT GCCGAACTCC 420

ATCTTTGAAA CACGAAAGTC CAGGGTAGAC TTTTCGAATC TGGGCTTGTT TTCTTTCTAT 480

CAGCTTTTTA ATGATCTCCT TCTGCTTTTT AATGATGACA GAGAACTCTG TGTATGGGAT 540

CTGAGGGTTC AGCTCACATC CCATCAAAGT GGCCCCTTCA TAATCCTTGA TGTAGCCAAC 600

ATATTTGGTT TTAGGTATTT TGATTTCTTT GGAGAAACCC TGCTTCTTGA AATAGCCGAT 660

GGCATACTCA TCTGCATATG TGAGGAAGTT GAGGATCTCG TGCTTTATGT GGTATTCTTT 720

GAGATGGTTC ATCAGGTGGG TTCCATAGCC CTTGACTTGT TCATTTGAGG TTACTGCACA 780

GAAAACAATC TCTGTGAATC CCTGGGATGG AAACATCCGG AAACAGATAC CACCAATGAC 840

ACGGCCATCT TTAATTAAAG CAAGGGTTTT GTGTTTCGGG TCAAAGACGA GCCGTGTGAT 900

GTACTCTTTG GGCATTCTGG GCAGCTGGTG GGAAAACACA TTCTGGAGGC CCACGAGCCA 960

CATCAGGATC TTCTTGTTTG GTTTCTGGTT CAGGGAGTTG CCCACCACGT GGAATTCAAT 1020

GACACCCCTG CGTTCTTCCA GCCGTGCCGC CTCATCTCTG GCCGAATGGG CTGACAGAAA 1080

ATTGGTCTCT GGTCCAAGCA TCCCTGCAGG GTCTGTGATG GTAGACATGA CCTCATTGAT 1140

CAATTCCACG GGAATATCCC CCATCACTCG AGATCTCTTG GCCTCCTCGG GAGCATGAGA 1200

GTTGTTCATT TTCCTCTTTT CTCCCGGGTT TGCTTCAAGC CCAGAAGAGC CTCTGCATCC 1260

AGGACTTGTT CTCCCTCCAT TGATCTGCTC ATGGGAAGTT GAATTTGAAC TGAACAATGC 1320

TGTCCCAGTA ACAGGAGGAC TGATTACTGT TTGGATTCCT AGCGGGCTGG TTCTGGAAGA 1380

GGCTGAGAGA AAATCCTGAT CCCAGATAGG AGAATTTTGA CTATACACTT CTTCTTCCAA 1440 

CATGGACAGA AACTTTGGGA AATGTGTGAG GATAAGCGTG CGTTTCTCAA GAGGCAGTTT 1500 

GTCTTTTTTC TGTCTGGCTT GTTCCAAGAG CTGTCGTCTC ATGATGGTGA AGACCGAGCG 1560 

AAGCAATGTT CTCCCAAACA CCTTTGTGGT TTCGTACCGA GGTAAGCTGT CACAGAACTG 1620 

CGGTACATTG CAGTAGCACA ACCACCTTGT GTAGTTTTCC TTGTATCCAG AGATGTCATC 1680

ATTGGGAGAC CGTAGTCTCC GCTGAGATGG AGCCTCCAGA TGCCAGTAGT TGATGCGGTT 1740

CAGAAACATC TTGGCCAGCT CGATCGTTGT CTGCCTCTCT TTCGATGGCA AGTGACTAAA 1800

CTTGTACTGC ACGAAGTTGT TCACACCCTG TTCAATACTG GGCTTCTCAA ATGGCGGCTT 1860

CTTCTCCAAG GAGCCTTCAA CCACAGGTTT TCCTCTTTGT AAAATTGACT TTCTCAAGAG 1920

CTTGAATAGG TAGAAGTACA CTTGTTTGGT ATCTGCATCT TCTTCTTTGT GGACGCAGGT 1980

GAAGAGGTAC TCCACATCCA ACACAATTCC CAGGAGTCTG TCCATCTCTT CCTCTGACAC 2040

ATTCTCCAAG TGAGAAACGT GAGCAGCAAG GGCATGGCTA CAGCTTCGAC AGGATTCTGT 2100

CAAACTGACA ATTATCTGCT GGAGGTCTCC TCTTGGTGGA GTAGGAGAGG GGTTAGGGTT 2160

CTTCCAGCCA TTGCATTTAC AGGACTCCTC TGCCTTGCAG GCGGAGTACA CGCCGAGTTT 2220

CTCCAGCTTC TTCGCCCGCG GAGCAGAGCG CAACTGCGCC TTCTTCACGG CGATCCGGGC 2280

CGAGCCGCCT CCTCCCGGTC CCTCGGCGGT GCCCGCCGCG GCCACCGGCG TCGCTGGCCC 2340

GCAGGAAGCA GAGCTCCCGG CAGCGGTGGC CAGGGTCCGG GGGGAACCGT GCGGGGGCGC 2400

GGGAGGCAGT GCTGGGGACC CGGCCCCGCC AGCCTCGGCC AT 2442

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
CCCGCCAGCC TCGGACATGC 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
CCCGCCAGCC TCGGCCATGC 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2442 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

ATGGCCGAGG CTGGCGGGGC CGGGTCCCCA GCACTGCCTC CCGCGCCCCC GCACGGTTCC
CCCCGGACCC TGGCCACCGC TGCCGGGAGC TCTGCTTCCT GCGGGCCAGC GACGCCGGTG 
GCCGCGGCGG GCACCGCCGA GGGACCGGGA GGAGGCGGCT CGGCCCGGAT CGCCGTGAAG 
AAGGCGCAGT TGCGCTCTGC TCCGCGGGCG AAGAAGCTGG AGAAACTCGG CGTGTACTCC 
GCCTGCAAGG CAGAGGAGTC CTGTAAATGC AATGGCTGGA AGAACCCTAA CCCCTCTCCT 
ACTCCACCAA GAGGAGACCT CCAGCAGATA ATTGTCAGTT TGACAGAATC CTGTCGAAGC 
TGTAGCCATG CCCTTGCTGC TCACGTTTCT CACTTGGAGA ATGTGTCAGA GGAAGAGATG 
GACAGACTCC TGGGAATTGT GTTGGATGTG GAGTACCTCT TCACCTGCGT CCACAAAGAA
GAAGATGCAG ATACCAAACA AGTGTACTTC TACCTATTCA AGCTCTTGAG AAAGTCAATT 
TTACAAAGAG GAAAACCTGT GGTTGAAGGC TCCTTGGAGA AGAAGCCGCC ATTTGAGAAG 
CCCAGTATTG AACAGGGTGT GAACAACTTC GTGCAGTACA AGTTTAGTCA CTTGCCATCG
AAAGAGAGGC AGACAACGAT CGAGCTGGCC AAGATGTTTC TGAACCGCAT CAACTACTGG 
CATCTGGAGG CTCCATCTCA GCGGAGACTA CGGTCTCCCA ATGATGACAT CTCTGGATAC 
AAGGAAAACT ACACAAGGTG GTTGTGCTAC TGCAATGTAC CGCAGTTCTG TGACAGCTTA 
CCTCGGTACG AAACCACAAA GGTGTTTGGG AGAACATTGC TTCGCTCGGT CTTCACCATC 
ATGAGACGAC AGCTCTTGGA ACAAGCCAGA CAGAAAAAAG ACAAACTGCC TCTTGAGAAA 
CGCACGCTTA TCCTCACACA TTTCCCAAAG TTTCTGTCCA TGTTGGAAGA AGAAGTGTAT
AGTCAAAATT CTCCTATCTG GGATCAGGAT TTTCTCTCAG CCTCTTCCAG AACCAGCCCG 
CTAGGAATCC AAACAGTAAT CAGTCCTCCT GTTACTGGGA CAGCATTGTT CAGTTCAAAT 
TCAACTTCCC ATGAGCAGAT CAATGGAGGG AGAACAAGTC CTGGATGCAG AGGCTCTTCT 
GGGCTTGAAG CAAACCCGGG AGAAAAGAGG AAAATGAACA ACTCTCATGC TCCCGAGGAG 
GCCAAGAGAT CTCGAGTGAT GGGGGATATT CCCGTGGAAT TGATCAATGA GGTCATGTCT 
ACCATCACAG ACCCTGCAGG GATGCTTGGA CCAGAGACCA ATTTTCTGTC AGCCCATTCG 
GCCAGAGATG AGGCGGCACG GCTGGAAGAA CGCAGGGGTG TCATTGAATT CCACGTGGTG 
GGCAACTCCC TGAACCAGAA ACCAAACAAG AAGATCCTGA TGTGGCTCGT GGGCCTCCAG 
AATGTGTTTT CCCACCAGCT GCCCAGAATG CCCAAAGAGT ACATCACACG GCTCGTCTTT 
GACCCGAAAC ACAAAACCCT TGCTTTAATT AAAGATGGCC GTGTCATTGG TGGTATCTGT 
TTCCGGATGT TTCCATCCCA GGGATTCACA GAGATTGTTT TCTGTGCAGT AACCTCAAAT 
GAACAAGTCA AGGGCTATGG AACCCACCTG ATGAACCATC TCAAAGAATA CCACATAAAG 
CACGAGATCC TCAACTTCCT CACATATGCA GATGAGTATG CCATCGGCTA TTTCAAGAAG 
CAGGGTTTCT CCAAAGAAAT CAAAATACCT AAAACCAAAT ATGTTGGCTA CATCAAGGAT
TATGAAGGGG CCACTTTGAT GGGATGTGAG CTGAACCCTC AGATCCCATA CACAGAGTTC 
TCTGTCATCA TTAAAAAGCA GAAGGAGATC ATTAAAAAGC TGATAGAAAG AAAACAAGCC 
CAGATTCGAA AAGTCTACCC TGGACTTTCG TGTTTCAAAG ATGGAGTTCG GCAGATTCCT 



ATAGAAAGCA TTCCTGGAAT CAGAGAGACA GGCTGGAAAC CAAGTGGAAA AGAGAAAAGT 2100

AAAGAGCCCA AAGACCCTGA GCACGTTTAC AGCACCCTCA AGAACATCCT GCAGCAGGTG 2160

AAGAACCATC CAAATGCTTG GCCTTTCATG GAACCAGTGA AGAGAACAGA AGCTCCGGGA 2220

TATTATGAAG TTATAAGGTT CCCCATGGAT CTGAAAACCA TGAGTGAACG CCTCAGGAAC 2280

AGGTACTATG TGTCTAAGAA GTTATTCATG GCGGACTTGC AACGAGTGTT CACCAACTGC 2340

AAGGAGTACA ACCCTCCCGA GAGCGAGTAC TACAAATGCG CCAGCATCCT GGAGAAGTTC 2400

TTCTTCAGTA AAATTAAGGA AGCAGGGTTG ATTGACAAGT GA 2442
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What is claimed is: 

1. A purified protein designated P/CAF having a molecular weight of about 93,000
daltons as determined by sodium dodecyl sulfate polyacrylamide gel electrophoresis
under reducing conditions and which acetylates histones.

2. The protein of claim 1 consisting of the amino acid sequence of SEQ ID NO:1.

3. The protein of claim 1 comprising the amino acid sequence of SEQ ID NO:2. 

4. The protein of claim 1, which also binds to the amino acid sequence of SEQ ID
NO:3 on a p300 cellular protein and to amino acid residues 1805-1854 of a CBP cellular
protein (SEQ ID NO:9).

5. A fragment of the protein of claim 1 having histone acetyltransferase activity.

6. A polypeptide consisting of the amino acid sequence of SEQ ID NO:2.

7. A fragment of the protein of claim 1 which binds to the amino acid sequence of 
SEQ ID NO: 3 on the p300 cellular protein and the amino acid sequence of SEQ ID 
NO:9 on the CBP cellular protein. 

8. A polypeptide consisting of the amino acid sequence of SEQ ID NO:4.

9. A nucleic acid consisting of the nucleotide sequence of SEQ ID NO: 10. 

10. A nucleic acid having a nucleotide sequence which encodes the protein of claim 
1. 
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11. A nucleic acid having a nucleotide sequence which encodes the protein of claim 

12. A nucleic acid having a nucleotide sequence which encodes the protein of claim
3.

13. A nucleic acid consisting of the nucleotide sequence which encodes the protein 
of claim 4. 

14. A nucleic acid complementary to and which selectively hybridizes with the 
nucleic acid of claim 11 under stringent hybridization conditions.

15. A fragment of the nucleic acid of claim 9, which encodes a polypeptide that 
acetylates histones. 

16. A fragment of the nucleic acid of claim 9, which encodes a polypeptide which
binds to the amino acid sequence of SEQ ID NO: 3 on the p300 cellular protein and the 
amino acid sequence of SEQ ID NO:9 on the CBP cellular protein. 

17. A purified antibody which specifically binds the protein of claim 1.

18. A purified antibody which specifically binds the protein of claim 2. 

19. A purified antibody which specifically binds the protein of claim 3. 

20. A purified antibody which specifically binds the protein of claim 4. 

21. An assay for screening substances for the ability to inhibit or stimulate the 
histone acetyltransferase activity of P/CAF comprising: 

a) contacting the substance with a system in which histone acetylation by 
P/CAF can be determined;

b) determining the amount of histone acetylation by P/CAF in the
presence of the substance; and

c) comparing the amount of histone acetylation by P/CAF in the 
presence of the substance with the amount of histone acetylation by P/CAF in the 
absence of the substance, a decreased or increased amount of histone acetylation by 
P/CAF in the presence of the substance indicating a substance that can inhibit or 
stimulate, respectively, the histone acetyltransferase activity of P/CAF. 

22. An assay for screening substances for the ability to inhibit binding of P/CAF to 
p300/CBP comprising: 

a) contacting the substance with a system in which the P/CAF binding of 
p300/CBP can be determined;

b) determining the amount of P/CAF binding of p300/CBP in the presence of 
the substance; and 

c) comparing the amount of binding of P/CAF to p300/CBP in the presence of 
the substance with the amount of binding of P/CAF to p300/CBP in the absence of the
substance, a decreased amount of binding of P/CAF to p300/CBP in the presence of the 
substance indicating a substance that can inhibit the ability to inhibit binding of P/CAF to 
p300/CBP. 

23. The method of claim 22, wherein the system consists of a cell free reaction 
mixture comprising a fragment of the p300 protein comprising amino acid residues 
1767-1816 (SEQ ID NO:3) and the protein of claim 4. 

24. The method of claim 22, wherein the system consists of a cell free reaction 
mixture comprising a fragment of the CBP protein comprising amino acid residues 
1805-1854 (SEQ ID NO:9) and the protein of claim 4.

25. The method of claim 22, wherein the system consists of a cell extract produced 
from cells producing both p300 and P/CAF.
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26. An assay for screening substances for the ability to inhibit or stimulate the 
histone acetyltransferase activity of p300/CBP comprising: 

a) contacting the substance with a system in which histone acetylation by 
p300/CBP can be determined; 

b) determining the amount of histone acetylation by p300/CBP in the
presence of the substance; and

c) comparing the amount of histone acetylation by p300/CBP in the 
presence of the substance with the amount of histone acetylation by p300/CBP in the 
absence of the substance, a decreased or increased amount of histone acetylation by 
p300/CBP in the presence of the substance indicating a substance that can inhibit or 
stimulate, respectively, the histone acetyltransferase activity of p300/CBP. 

27. An assay for screening substances for the ability to inhibit binding of a DNA- 
binding transcription factor to p300/CBP comprising: 

a) contacting the substance with a system in which the DNA-binding 
transcription factor binding of p300/CBP can be determined;

b) determining the amount of DNA-binding transcription factor binding of 
p300/CBP in the presence of the substance; and 

c) comparing the amount of binding of DNA-binding transcription factor to 
p300/CBP in the presence of the substance with the amount of binding of DNA-binding 
transcription factor to p300/CBP in the absence of the substance, a decreased amount of 
binding of DNA-binding transcription factor to p300/CBP in the presence of the 
substance indicating a substance that can inhibit the ability to inhibit binding of DNA- 
binding transcription factor to p300/CBP. 

28. The method of claim 27, wherein the system consists of a cell free reaction 
mixture comprising a DNA-binding transcription factor and p300/CBP. 



29. The method of claim 27, wherein the system consists of a cell free reaction 
mixture comprising a fragment of the CBP protein comprising a DNA-binding 
transcription factor and p300/CBP. 
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30. The method of claim 27, wherein the system consists of a cell extract produced 
from cells producing both a DNA-binding transcription factor and p300/CBP. 

31. The method of claim 27, wherein the DNA-binding transcription factor is
selected from the group consisting of a nuclear hormone receptor, CREB, c-Jun/v-Jun,
c-Myb/v-Myb, YY1, Sap-1a, c-Fos, MyoD and SRC-1.

32. A method for inhibiting the transcription modulating activity of P/CAF in a 
subject, comprising administering to the subject a transcription modulating activity 
inhibiting amount of a substance in a pharmaceutically acceptable carrier. 

33. The method of claim 32, wherein the substance can inhibit the transcription 
modulating activity of P/CAF by preventing the binding of P/CAF to p300/CBP. 

34. A method for stimulating the transcription modulating activity of P/CAF in a 
subject, comprising administering to the subject a transcription modulating activity 
stimulating amount of a substance in a pharmaceutically acceptable carrier. 

35. The method of claim 34, wherein the substance can stimulate the transcription 
modulating activity of P/CAF by promoting the binding of P/CAF to p300/CBP.

36. The method of claim 34, wherein the substance can stimulate the transcription
modulating activity of P/CAF by stimulating the histone acetyltransferase activity of
P/CAF.



37. A method for inhibiting the histone acetyltransferase activity of p300/CBP in a 
subject, comprising administering to the subject a histone acetyltransferase activity
inhibiting amount of a substance in a pharmaceutically acceptable carrier. 
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38. The method of claim 37, wherein the substance can inhibit the transcription 
modulating activity of p300/CBP by preventing the binding of a DNA-binding 
transcription factor to p300/CBP. 

39. The method of claim 38, wherein the DNA-binding transcription factor is 
selected from the group consisting of a nuclear hormone receptor, CREB, c-Jun/v-Jun, 
c-Myb/v-Myb, YY1, Sap-1a, c-Fos, MyoD and SRC-1.

40. The method of claim 37, wherein the substance is an antibody which binds 
p300/CBP. 

41. A method for stimulating the histone acetyltransferase activity of p300/CBP in a
subject, comprising administering to the subject a histone acetyltransferase activity 
stimulating amount of a substance in a pharmaceutically acceptable carrier. 

42. The method of claim 41, wherein the substance can stimulate the histone 
acetyltransferase activity of p300/CBP by promoting the binding of a DNA-binding 
transcription factor to p300/CBP. 
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